Posts by Year
2026
2025
Busy From Works
Hello everyone, long time no writing. The past few days have been quite busy for me. Honestly, I’ve been feeling a bit too lazy to write, but the main reason...
Long Updates
Hello everyone, long time no see. This journal is going to be a little bit long, considering there are a lot of gaps I have missed since Wednesday. Please bear ...
Back to back meetings
Good evening everyone. A bit tired today. A lot of meetings, even back-to-back meetings since yesterday. Also, some stuff happened before I left work. An...
Lucky weekends
PLEASE DON’T READ THIS!!!! IT’S BORING
Short one
Good evening, sorry, there will be no post today. I feel like I am lacking things to talk about today. But for the sake of my habits, I keep writing...
Seems usual
Good evening. Another day, another chance to write for the sake of building habits. It’s starting to feel fun, not gonna lie. The thing is, I need a single r...
Happy new year
Happy New Year from My Bedroom!
2024
New year new me
Back on Track: 2 Days of Good Habits and Counting!
Long time not writing, let’s start
Introduction
Day 7 Algorit.ma : Capstone Project
Fraud Prediction
Day 6 Algorit.ma : Unsupervised Learning
Day 6: here I will share my notes from the in-class notebook. For further examples, you can check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...
Day 5 Algorit.ma : Classification Model
Day 5: here I will share my notes from the in-class notebook. For further examples, you can check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...
Day 4 Algorit.ma : Regression Model
Day 4: here I will share my notes from the in-class notebook. For further examples, you can check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...
Day 3 Algorit.ma : Practical Statistics
Day 3: here I will share my notes from the in-class notebook. For further examples, you can check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...
Day 2 Algorit.ma : Data Wrangling and Visualization
Day 2: here I will share my notes from the in-class notebook. For further examples, you can check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...
Day 1 Algorit.ma : Python For Data Analysis
Day 1: here I will share my notes from the in-class notebook. For further examples, you can check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...
Creating a shortcut for Jupyter Notebook
This is a quick post on how to create a shortcut for Jupyter Notebook. In this case, you need to point to the path of your Conda Python installation. Here’s how:
Preclass Bootcamp Algorit.ma BFLP Audit
Well, this post (I hope I can make it a series) will be my personal notes and documentation of the data science bootcamp sessions from Algorit.ma. Please note ...
2023
Last Year Challenge
Well, this post is going to be about my final year in the Netherlands. A bit of drama here and there, but let’s see what happens.
2nd year in the Netherlands
A long break since the previous post. I’ve been busy with my new job (more like training) recently, so yeah, I might update this blog a little bit next year ...
Farewell The Netherlands
It’s been a while since I created a blog post. Apparently my last post was almost 8 months ago. But now I am creating a farewell post to the Netherlands because I am g...
2022
What is ChatGPT
GPT (Generative Pre-trained Transformer) is a type of artificial intelligence model developed by OpenAI that can be used for tasks such as language translat...
How to get your personal Dota2 Data
This is going to be a short post. This is really interesting for me personally. As a Data Scientist and avid Dota 2 player, what could be better than doing d...
Pandas Exercise 7 : Visualization Bonus
A continuation of my practice with the Pandas exercises from guipsamora. This one is interesting because it covers basic visualization exercises in Matplotlib.
Pandas Alternatives : Modin
The Pandas library has become the “must-install” library for data manipulation in Python and is widely used by data scientists and analysts. Pandas provides a...
Pandas Exercise 7 : Visualization
A continuation of my practice with the Pandas exercises from guipsamora. This one is interesting because it covers basic visualization exercises in Matplotlib.
LeTAD #3: Next Season DPC
Well, now let’s talk about the next DPC season and all the drama around it. Of course, a lot of things are happening, so to start with, congratulations to Tundra Es...
Pandas Exercise 6 : Stats
A continuation of my practice with the Pandas exercises from guipsamora.
Pandas Exercise 5 : Merge
A continuation of my practice with the Pandas exercises from guipsamora.
Pandas Exercise 4 : Apply
A continuation of my practice with the Pandas exercises from guipsamora.
Pandas Exercise 3 : Grouping
A continuation of my practice with the Pandas exercises from guipsamora.
Pandas Exercise 2 : Filtering and Sorting
A continuation of my practice with the Pandas exercises from guipsamora.
Pandas Exercise 1 : Knowing your data
So in this exercise we are going to use a dataset from the internet to make it easier. You can download the exercise from here. I am just bored and keep tryi...
LeTAD #2: Dota 2 Configuration
So for the second post, I am going to show you the configuration of my personal Dota 2 settings. It is going to be a little bit of a long explanation ...
LeTAD #1 : Backstories about me playing this ‘game’
It’s been a long time since I wanted to make content about Dota. I like to do random analyses about Dota sometimes. I hope in this series of posts I can shar...
Python 3.11
The latest version of Python was released last week, on 24 October 2022. The 3.11 changelog consists of a lot of bug fixes, improvements, and additional...
Web Scraping with BeautifulSoup4
The surge of available data we can find on the internet is insane. With this surge, data analytics has become a hugely important part of the way organization...
BBC Recommendation System (Content-Based)
Introduction
2021
Handling Imbalance Data
Data imbalance usually reflects an unequal distribution of classes within a dataset. In class imbalance, one trains on a dataset that contains a large number...
Python Data Visualization Guide
Creating a visualization may not be as easy as it looks. Some visualizations may look cool but fail to convey what they mean. Imagine after a hard and l...
Automated EDA Library for Python
After I reviewed my knowledge of exploratory data analysis (EDA) here, I wondered whether there is a new way to understand your dataset more easil...
Principal Component Analysis
Hello guys, it’s been 3 months since my last post in Machine Learning. I’ll admit that I am a little bit rusty nowadays. Because of my interviews in some com...
2020
Introduction Neural Network
You can import the library:
MNIST with Multilayer Perceptron
In this post we will build a Multi-Layer Perceptron model to try to classify handwritten digits using TensorFlow (a very famous example).
Introduction Neural Network
Artificial Neural Network (ANN) is a computational model that is inspired by the way biological neural networks in the human brain process information. Artif...
Introduction Spark
Let’s learn how to use Spark with Python by using the pyspark library! Make sure to view the video lecture explaining Spark and RDDs before continuing on wit...
Natural Language Processing with NLTK
Natural Language Processing (NLP) is broadly defined as the automatic manipulation of natural language, like speech and text. Natural language is primarily ...
NLP Project
Welcome to the NLP Project for this section of the course. In this NLP project you will be attempting to classify Yelp Reviews into 1 star or 5 star categori...
Movie Recommender System Analysis
Welcome to the code notebook for Recommender Systems with Python. In this lecture we will develop basic recommendation systems using Python and pandas. There...
Dimensionality Reduction
The performance of machine learning algorithms can degrade with too many input variables. Having a large number of dimensions in the feature space can mean t...
K Means
Clustering is a technique widely used to find groups of observations (clusters) that share similar characteristics. This process is not driven by a specific ...
K-Means Exercise
Exercise from Jose Portilla’s Python for Data Science Bootcamp.
Support Vector Machine
The support vector machine is a generalization of a classifier called the maximal margin classifier. The maximal margin classifier is simple, but it cannot be ap...
Support Vector Machine Exercise
Exercise from Jose Portilla’s Python for Data Science Bootcamp.
Random Forest
Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions from all models. Random ...
Decision Tree
Decision trees are a very popular machine learning algorithm. They are popular for a variety of reasons, their interpretability being probably their most i...
Decision Tree and Random Forest Exercise
Exercise from Jose Portilla’s Python for Data Science Bootcamp.
Categorical Encoding 2
Another reference and shared post from https://www.mygreatlearning.com/blog/label-encoding-in-python/
K Nearest Neighbour
K Nearest Neighbour (KNN) works by choosing the best number $k$ of neighbours. A neighbour, by definition, is a person living near or next door to the speaker or person ...
K Nearest Neighbour Exercise
Exercise from Jose Portilla’s Python for Data Science Bootcamp.
Logistic Regression
Logistic regression is a classification algorithm used to assign observations to a discrete set of classes. It is a statistical machine learning algorithm th...
Logistic Regression Exercise
Exercise from Jose Portilla’s Python for Data Science Bootcamp.
EDA Regression Exercise
Today, I will try exploratory data analysis and regression with insurance data from Kaggle. Let’s take a look.
Linear Regression
Linear regression is useful for finding the relationship between two continuous variables. Linear regression is a linear model, a model that creates a linear rel...
Linear Regression Exercise
This week I will dedicate my time to solving all the exercises from Jose Portilla’s Python for Data Science Bootcamp.
Feature Selection
Most data nowadays is massive. Datasets often come with many irrelevant features that do not contribute much to the accuracy of your predictive mode...
Categorical Encoding
In practice, real datasets contain categorical values. So what is the difference between an ordinary string value and a categorical value? Well, some...
Word Vector
Machine learning can’t process non-numeric values. So how do you process image or text data? Before you train on your image or text data, you need to transform th...
Numerical Transformer
After rescaling or normalizing the data, there is another way to change its distribution: transformation. There are 3 different ways to transform...
Feature Scaling
Numerical data is already digestible by machine learning models and mathematical formulas. But that doesn’t mean it no longer needs feature engineering or preproces...
Handling Missing Values
Missing values in your data are pretty common in real life. In fact, the chance that at least one data point is missing increases as the data set size increase...
Stocks Finance Exercise
In this data project we will focus on exploratory data analysis of stock prices. Keep in mind, this project is just meant to practice your visualization and ...
Data Capstone Exercise
For this capstone project we will be analyzing some 911 call data from Kaggle. The data contains the following fields:
Python Crash Course Exercise 6
Today I will be completing the data visualization with Pandas exercise. If you want to solve it all by yourself, you can download the notebook file here and sample ...
Python Crash Course Exercise 5
Today I will be completing the Matplotlib exercise. If you want to solve it all by yourself, you can download the notebook file here
Python Crash Course Exercise 4
Today I will be completing the Pandas exercise using SF Salaries. If you want to solve it all by yourself, you can download the notebook file here and the dataset here
Python Crash Course Exercise 3
Today I will be completing the Pandas exercise using Ecommerce Purchase. If you want to solve it all by yourself, you can download the notebook file here and the dataset h...
Python Crash Course Exercise 2
Today I will be completing the NumPy exercise. If you want to solve it all by yourself, you can download the notebook file here
Python Crash Course Exercise
This week I will dedicate my time to solving all the exercises from Jose Portilla’s Python for Data Science Bootcamp. Today I will be completing some exercises from Pytho...
Interactive Visualization Plotly
The plotly Python library is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, fina...
Better Visualization : Seaborn 2
Multi-plot grids are general types of plots that allow you to map plot types to rows and columns of a grid; this helps you create similar plots separated by fe...
Better Visualization : Seaborn
After discussing basic visualization with Matplotlib, now let’s try another but more attractive visualization library called Seaborn. Seaborn is a Python dat...
Choosing Visualization with Matplotlib as Example
When it comes to data visualization, the first and the most critical step is to select the correct visualization for the data that you want to present. With ...
Python Matplotlib
Matplotlib is the most basic library for data visualization with Python. It was created to try to replicate MATLAB’s (another programming language) plotting capab...
Python Pandas 2
Continuing from the last post, let’s look at more features in pandas.
Python Pandas
Next, let’s discuss Pandas. Preparing and munging data was what Python was mainly used for before the introduction of the Pandas library. After ...
Python Numpy
Let’s continue with NumPy. NumPy is a Python library used for working with arrays. It also has functions for working in the domain of linear algebra and matrices...
Python Basic
Before we dive into Pandas, NumPy and Matplotlib, let’s first remind ourselves of the basics of Python. I won’t cover all the Python stuff because it would take too much t...
Starting Material
OK, now after a 1-day break and dilly-dallying over the theory of Machine Learning and Evaluation Metrics (I kind of regret telling the theory first because it t...
Evaluation Metric Special ROC-AUC
A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its dis...
Evaluation Metric Clustering
Clustering is an important part of the machine learning pipeline for business or scientific enterprises utilizing data science. As the name suggests, it help...
Evaluation Metric Regression
A regression task is the prediction of the state of an outcome variable at a particular timepoint with the help of other correlated independent variables. The ...
Evaluation Metric Binary Classification
Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given proble...
TOEFL Writing Template
Broadly speaking, you’ll get a TOEFL independent writing question based on one of the following styles:
TOEFL Speaking Template
TOEFL Speaking Question 1 (Opinion about something)
Testing and Validation
The only way to know how good your model is, is to actually try it on new or different cases. But if your model is not doing as well as expected, surely you will hesitate ...
Challenges in Machine Learning
At some level, running machine learning systems at scale is challenging for several reasons. The systems issues are often misunderstood. Although best prac...
Types of Machine Learning 2
… Continued from the last post
Types of Machine Learning
There are many different types of machine learning systems. At a high level, machine learning is simply the study of teaching a computer program or algorithm ...
Libraries in Python for Data Science
First of all, why use Python for Data Science? According to recent surveys by KDnuggets, Python is the preferred programming language for data scientists. ...
Interesting and Open Public Data Aggregator
There are hundreds of zettabytes of data available on the internet, but most of it is not publicly accessible. Today, I will share where to find open public data...
Data Types
There are many different kinds of data types. In this blog, I will explain these data types based on the most common understanding in Data Science. Specifically i...
Structured and Unstructured Data
One of the simple ways to think about data is whether it is structured or not. Well, first of all, not all data is created equal. Some data is s...
Data Science Tools
After finding a reason and motivation to start learning about Data Science, let’s talk about the tools for practicing Data Science. Machine learning tools m...
What is Data Science
So today I will start my journey by collecting from the internet what Data Science is. So what is Data Science? Why is Data Science so popular be...
Starting New Journey
Short blog post. I hope in the future I can try to be productive. Recently, I feel bored with the Covid-19 situation. So I hope in the next few days I can ma...
2019
How to Procrastinate and still get things done by John Perry
• February 23, 1996 • By John Perry
Review of a Startup in Indonesia (Mamikos)
Looking for boarding house (kost) information is a must for anyone living away from home. Continuing your education or working in another city is certainly no longer ...
2017
0000 Learning Algorithms and Data Structures : Introduction to Python 3
Morning folks, hope you’re all awake. So, to fill my idle time on campus, today I will start posting about algorithms and data structures. ...
0010 Learning Machine Learning : Matplotlib
Another midnight post -_-. What should we talk about??? Since ML would be no fun without illustrations ( ͡° ͜ʖ ͡°), let’s talk about illustrations that can...
0001 Learning Machine Learning : Pandas
A midnight post, folks, while I’ve got nothing to do. On second thought, let’s continue talking about ML like yesterday ( ͡° ͜ʖ ͡°). Pandas is a kind of Python library that is usually...
0000 Learning Machine Learning : Hello Machine Learning
Machine Learning is the study of software that uses past experience to make decisions in the future. The basic goal of Machine Learnin...
Hello World
Syntax highlighting is a feature that displays source code in different colors and fonts according to the category of terms. This feature facilitates writin...