From 88c4b4dcfc60af0699a665536feee14789cddbd7 Mon Sep 17 00:00:00 2001
From: aandriella <aandriella@iri.upc.edu>
Date: Thu, 5 Aug 2021 22:18:34 +0200
Subject: [PATCH] Update README.md

---
 README.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index f6f1315..3ade262 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,13 @@
 ### GOAL simulator + Policy generator ###
 #### STEPS:
-##### - 1: CREATE INITIAL USER COGNITIVE MODEL FROM DATA (human therapist and patient)
-##### - 2: CREATE ROBOT INITIAL POLICY FROM DATA (human therapist) OR UPDATE IT IF SESSION > 0
-##### - 3: RUN THE SIMULATION using the [BN_GenerativeModel](https://github.com/aandriella/BN_GenerativeModel) package
-##### - 4: GENERATE THE NEW EPISODES
-##### - 5: LEARN THE ROBOT REWARD USING MAXIMUM CAUSAL ENTROPY INVERSE REINFORCEMENT LEARNING algorithm proposed Ziebart's thesis (2010) [MaxEntropyIRL](https://github.com/aandriella/MaxEntRL)
-##### - 6: COMPUTE THE POLICY RELATED TO THAT REWARD USING VALUE ITERATION
-##### - 7: RUN A SESSION WITH THE PATIENT
-##### - REPEAT FROM 2
+###### - 1: Create initial user cognitive model from data (human therapist and patient)
+###### - 2: Create robot policy from data (human therapist) or update it if session > 0
+###### - 3: Run the simulation using the [BN_GenerativeModel](https://github.com/aandriella/BN_GenerativeModel) package
+###### - 4: Generate new episodes
+###### - 5: Learn the robot reward using Max Causal Entropy Inverse Reiforcement Learning algorithm proposed Ziebart's thesis (2010) [MaxEntropyIRL](https://github.com/aandriella/MaxEntRL)
+###### - 6: Compute the policy related to that reward using Value Iteration
+###### - 7: Run a session between the robot and the patient
+###### - Repeat from 2
 
 #### Package:
 
@@ -34,4 +34,4 @@ where:
 - user_id, id of the user
 - with_feedback, if [SOCIABLE](http://www.iri.upc.edu/files/scidoc/2353-Discovering-SOCIABLE:-Using-a-conceptual-model-to-evaluate-the-legibility-and-effectiveness-of-backchannel-cues-in-an-entertainment-scenario.pdf) is used
 - session, the session id
-- agent_objective, objective can be either challenge if we want to challenge more the user, help if we want to help more the user or finally it can neutral so we do not reshape the policy.
\ No newline at end of file
+- agent_objective, objective can be either challenge if we want to challenge more the user, help if we want to help more the user or finally it can neutral so we do not reshape the policy.
-- 
GitLab