From 9788cd38114053ff2ee81b2051d6d5b711d279c5 Mon Sep 17 00:00:00 2001
From: Jen Looper
Date: Tue, 15 Jun 2021 12:34:07 -0400
Subject: [PATCH] edits to RL homepage

---
 8-Reinforcement/README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/8-Reinforcement/README.md b/8-Reinforcement/README.md
index 7192636564..499f655170 100644
--- a/8-Reinforcement/README.md
+++ b/8-Reinforcement/README.md
@@ -14,8 +14,8 @@ In previous sections, you have seen two example of machine learning problems:
 
-* **Supervised**, where we had some datasets that show sample solutions to the problem we want to solve. [Classification][Classification] and [regression][Regression] are supervised learning tasks.
-* **Unsupervised**, in which we do not have training data. The main example of unsupervised learning is [clustering][Clustering].
+* **Supervised**, where we have datasets that show sample solutions to the problem we want to solve. [Classification](../4-Classification/README.md) and [Regression](../2-Regression/README.md) are supervised learning tasks.
+* **Unsupervised**, in which we do not have labeled training data. The main example of unsupervised learning is [Clustering](../5-Clustering/README.md).
 
 In this section, we will introduce you to a new type of learning problems, which do not require labeled training data. There are a several types of such problems:
@@ -24,9 +24,9 @@ In this section, we will introduce you to a new type of learning problems, which
 Suppose, you want to teach computer to play a game, such as chess, or [Super Mario](https://en.wikipedia.org/wiki/Super_Mario). For computer to play a game, we need it to predict which move to make in each of the game states. While this may seem like a classification problem, it is not - because we do not have a dataset with states and corresponding actions. While we may have some data like that (existing chess matches, or recording of players playing Super Mario), it is likely not to cover sufficiently large number of possible states.
 
-Instead of looking for existing game data, **reinforcement learning** (RL) is based on the idea of *making the computer play* many times, observing the result. Thus, to apply reinforcement learning, we need two things:
+Instead of relying on existing game data, **Reinforcement Learning** (RL) is based on the idea of *making the computer play* many times and observing the result. Thus, to apply Reinforcement Learning, we need two things:
 
 1. **An environment** and **a simulator**, which would allow us to play a game many times. This simulator would define all game rules, possible states and actions.
-2. **A reward function**, which would tell us how good we did during each move or game.
+2. **A reward function**, which would tell us how well we did during each move or game.
 
 The main difference between supervised learning is that in RL we typically do not know whether we win or lose until we finish the game. Thus, we cannot say whether a certain move alone is good or now - we only receive reward at the end of the game. And our goal is to design such algorithms that will allow us to train a model under such uncertain conditions. We will learn about one RL algorithm called **Q-learning**.
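For context on the **Q-learning** algorithm the lesson text mentions, its core loop can be sketched in a few lines. This is a minimal illustrative sketch, not code from the lesson: the learning rate, discount factor, exploration rate, and the state/action names used below are hypothetical placeholders.

```python
import random
from collections import defaultdict

# Q-table: estimated long-term reward for each (state, action) pair,
# defaulting to 0.0 for pairs we have never visited.
Q = defaultdict(float)

alpha = 0.1    # learning rate: how strongly each observation updates Q
gamma = 0.9    # discount factor: how much future reward counts
epsilon = 0.1  # exploration rate for epsilon-greedy action choice

def choose_action(state, actions):
    # Mostly exploit the best-known action, occasionally explore at random.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, actions):
    # Q-learning update: nudge Q(state, action) toward the observed reward
    # plus the discounted best value achievable from the next state.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```

Note how this addresses the problem the text raises about reward arriving only at the end of the game: the `gamma * best_next` term propagates the final reward backwards through the Q-values of earlier states as the computer plays many games.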