Blogs

Resources, tutorials, and notes

Bayesian Networks

Network Representation, D-Separation, Inference, Sampling

15 min read · November 13, 2025

2025 · theory · sample-posts

Curriculum Learning Methods

Task-Specific, Teacher-Guided, Self-Play, Automatic Goal Generation

13 min read · November 06, 2025

2025 · theory · sample-posts

Multi-Armed Bandit Problems

Bandit Strategies, UCB1, Bayesian UCB, Thompson Sampling

10 min read · November 04, 2025

2025 · theory prototype · sample-posts

Natural Policy Gradient Methods

General overview of TRPO and PPO

8 min read · October 26, 2025

2025 · theory · sample-posts

Deep Q-Learning

Double Q-Learning and Deep Q-Networks

10 min read · October 23, 2025

2025 · theory prototype · sample-posts

Transformer Architecture

Original design, BERT, GPT-1

16 min read · October 19, 2025

2025 · theory · sample-posts

Deep Learning

Key concepts, FFNs, Sequence Models

22 min read · October 19, 2025

2025 · theory · sample-posts

Vanilla Policy Gradient Methods

Theoretical foundations and REINFORCE

7 min read · October 12, 2025

2025 · theory · sample-posts

LLM Fine-Tuning Taxonomy

Supervised fine-tuning, reinforcement learning

2 min read · October 11, 2025

2025 · overview · sample-posts

Reinforcement Learning

Model-based and model-free learning (Direct, TD, Q-Learning)

12 min read · October 10, 2025

2025 · theory · sample-posts

Adversarial Search Algorithms

Minimax, Expectimax, Monte Carlo Tree Search

12 min read · October 09, 2025

2025 · theory · sample-posts

Markov Decision Processes

A general overview of Markov Decision Processes (MDPs)

9 min read · October 08, 2025

2025 · theory · sample-posts

A post with pseudocode

This is what included pseudocode could look like

2 min read · April 15, 2024

2024 · formatting code · sample-posts