LLM Fine-Tuning Taxonomy
Conceptual Table
| Concept | Axis | Main Question |
|---|---|---|
| Offline vs. Online (RL) | Planning (MDP) vs. Learning (RL) | Do we already know the environment? |
| Model-based vs. Model-free | Environment model | Do we know or approximate the transition and reward functions (T̂, R̂), or do we learn directly from interactions? |
| Value-based vs. Policy-based | Representation | Do we learn a value function or a policy directly? |
| Passive vs. Active | Exploration | Are we evaluating a fixed policy or improving one through our own choice of actions? |
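
To make the first two rows of the table concrete, here is a minimal sketch on a toy problem. The two-state environment, the `step` function, and all hyperparameters are illustrative assumptions, not part of any fine-tuning setup. `value_iteration` plans with a known model (the MDP, model-based case), while `q_learning` only sees sampled transitions (the RL, model-free, value-based case).

```python
import random

GAMMA = 0.9
STATES = [0, 1]
ACTIONS = [0, 1]  # hypothetical actions: 0 = stay, 1 = move to the other state


def step(s, a):
    """Toy deterministic environment (an assumption for illustration).

    Returns (next_state, reward). Staying in state 1 pays 1, everything else 0.
    """
    s_next = s if a == 0 else 1 - s
    reward = 1.0 if (s == 1 and a == 0) else 0.0
    return s_next, reward


def value_iteration(n_sweeps=200):
    """Planning: T and R are known, so we query the model for every (s, a)
    and apply Bellman optimality backups without ever acting."""
    V = [0.0 for _ in STATES]
    for _ in range(n_sweeps):
        new_V = []
        for s in STATES:
            backups = []
            for a in ACTIONS:
                s_next, r = step(s, a)  # model lookup, not an interaction
                backups.append(r + GAMMA * V[s_next])
            new_V.append(max(backups))
        V = new_V
    return V


def q_learning(n_steps=5000, alpha=0.5, seed=0):
    """Learning: the agent never inspects T or R; it only observes the
    transitions it experiences while acting with a uniformly random policy."""
    rng = random.Random(seed)
    Q = [[0.0 for _ in ACTIONS] for _ in STATES]
    s = 0
    for _ in range(n_steps):
        a = rng.choice(ACTIONS)
        s_next, r = step(s, a)  # one sampled interaction
        td_target = r + GAMMA * max(Q[s_next])
        Q[s][a] += alpha * (td_target - Q[s][a])
        s = s_next
    return Q


def greedy_policy(Q):
    """Pick the highest-valued action in each state."""
    return [max(ACTIONS, key=lambda a: Q[s][a]) for s in STATES]


if __name__ == "__main__":
    print("planned V:", value_iteration())
    print("learned greedy policy:", greedy_policy(q_learning()))
```

On this toy problem both routes should agree on the same behaviour (move out of state 0, stay in state 1); the difference is only in what information each one is allowed to use.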
Taxonomy