Update README.md

Commit 88c4b4dc (unverified), authored 3 years ago by Antonio Andriella, committed by GitHub 3 years ago
--- a/README.md
+++ b/README.md
 ### GOAL simulator + Policy generator ### 
 #### STEPS:
-##### - 1: CREATE INITIAL USER COGNITIVE MODEL FROM DATA (human therapist and patient)
-##### - 2: CREATE ROBOT INITIAL POLICY FROM DATA (human therapist) OR UPDATE IT IF SESSION > 0
-##### - 3: RUN THE SIMULATION using the [BN_GenerativeModel](https://github.com/aandriella/BN_GenerativeModel) package
-##### - 4: GENERATE THE NEW EPISODES
-##### - 5: LEARN THE ROBOT REWARD USING MAXIMUM CAUSAL ENTROPY INVERSE REINFORCEMENT LEARNING algorithm proposed Ziebart's thesis (2010) [MaxEntropyIRL](https://github.com/aandriella/MaxEntRL)
-##### - 6: COMPUTE THE POLICY RELATED TO THAT REWARD USING VALUE ITERATION
-##### - 7: RUN A SESSION WITH THE PATIENT
-##### - REPEAT FROM 2
+###### - 1: Create initial user cognitive model from data (human therapist and patient)
+###### - 2: Create robot policy from data (human therapist) or update it if session > 0
+###### - 3: Run the simulation using the [BN_GenerativeModel](https://github.com/aandriella/BN_GenerativeModel) package
+###### - 4: Generate new episodes
+###### - 5: Learn the robot reward using the Maximum Causal Entropy Inverse Reinforcement Learning algorithm proposed in Ziebart's thesis (2010) [MaxEntropyIRL](https://github.com/aandriella/MaxEntRL)
+###### - 6: Compute the policy related to that reward using Value Iteration
+###### - 7: Run a session between the robot and the patient
+###### - Repeat from 2
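
Step 6 above (computing a policy from a learned reward with Value Iteration) can be sketched as follows. This is a minimal, generic illustration and not the repository's implementation: the function name `value_iteration`, the array layout of `P` and `R`, and the three-state toy problem are all assumptions made for the example.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Generic value iteration (hypothetical helper, not the repo's API).

    P: array of shape (A, S, S), P[a, s, s2] = probability of s -> s2 under a.
    R: array of shape (S,), reward attached to each state
       (e.g. the reward recovered by an IRL step).
    Returns the state values V (S,) and a greedy policy (S,) of action indices.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman backup: Q[a, s] = R[s] + gamma * sum_s2 P[a, s, s2] * V[s2]
        Q = R[None, :] + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Recompute Q from the final values and act greedily.
    Q = R[None, :] + gamma * (P @ V)
    return V, Q.argmax(axis=0)


# Toy problem (assumed for illustration): a 3-state chain,
# action 0 moves left, action 1 moves right, walls at both ends.
P = np.zeros((2, 3, 3))
for s in range(3):
    P[0, s, max(s - 1, 0)] = 1.0  # left
    P[1, s, min(s + 1, 2)] = 1.0  # right
R = np.array([0.0, 0.0, 1.0])     # only the right-most state is rewarded

V, policy = value_iteration(P, R, gamma=0.9)
print(policy)  # greedy action per state
```

In this toy chain the greedy policy moves right in every state, since the only rewarded state sits at the right end.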


 #### Package:
@@ -34,4 +34,4 @@ where:
 - user_id, id of the user 
 - with_feedback, if  [SOCIABLE](http://www.iri.upc.edu/files/scidoc/2353-Discovering-SOCIABLE:-Using-a-conceptual-model-to-evaluate-the-legibility-and-effectiveness-of-backchannel-cues-in-an-entertainment-scenario.pdf)  is used 
 - session,  the session id
- agent_objective, objective can be either challenge if we want to challenge more the user, help if we want to help more the user or finally it can neutral so we do not reshape the policy.
\ No newline at end of file
+- agent_objective, the agent's objective: challenge to challenge the user more, help to help the user more, or neutral to leave the policy unchanged (no reshaping).
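
The parameters listed above could be grouped and validated as in the sketch below. Everything here is hypothetical: the class name `RunConfig`, the field types, and the helper properties are assumptions for illustration, and only the four parameters visible in this hunk are covered. It does not reflect how the package actually parses its arguments.

```python
from dataclasses import dataclass

# The three objectives named in the README.
VALID_OBJECTIVES = ("challenge", "help", "neutral")


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the parameters described above."""

    user_id: int          # id of the user (type assumed)
    with_feedback: bool   # whether SOCIABLE feedback is used
    session: int          # session id (assumed to start at 0)
    agent_objective: str = "neutral"

    def __post_init__(self):
        if self.agent_objective not in VALID_OBJECTIVES:
            raise ValueError(
                f"agent_objective must be one of {VALID_OBJECTIVES}, "
                f"got {self.agent_objective!r}"
            )
        if self.session < 0:
            raise ValueError("session must be non-negative")

    @property
    def reshapes_policy(self) -> bool:
        # Per the README, 'neutral' means the policy is not reshaped.
        return self.agent_objective != "neutral"

    @property
    def updates_existing_policy(self) -> bool:
        # Per step 2, the policy is updated (not created) when session > 0.
        return self.session > 0


config = RunConfig(user_id=1, with_feedback=True, session=2,
                   agent_objective="challenge")
print(config.reshapes_policy, config.updates_existing_policy)
```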