Bachelor-Project

Getting Started

The code in this repository uses Python 3.10 or 3.11 and runs on Linux.

Some tasks from OpenAI Gym use Box2D. For this to work you need to install SWIG. You can do this using brew with the following command:

brew install swig

To initialize the submodules, run:

git submodule update --init --recursive

Once SWIG is installed, you can install the required packages with:

pip install -r requirements.txt

Errors

If you get an AttributeError like the following:

AttributeError: module '_Box2D' has no attribute 'RAND_LIMIT_swigconstant'

Try installing gymnasium with all extras; this should solve the problem:

pip install "gymnasium[all]"

Environments

This repository contains the following environments:

  • CartPole-v1
  • MountainCar-v0
  • LunarLander-v2
  • Acrobot-v1
  • MountainCarContinuous-v0
  • Pendulum-v1
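
To verify that the installation works, a quick check like the sketch below can create each environment once. This assumes the gymnasium version pinned in requirements.txt (which still registers LunarLander-v2) and a working Box2D build:

import gymnasium as gym

ENV_IDS = [
    "CartPole-v1",
    "MountainCar-v0",
    "LunarLander-v2",
    "Acrobot-v1",
    "MountainCarContinuous-v0",
    "Pendulum-v1",
]

for env_id in ENV_IDS:
    # gym.make raises here if Box2D/SWIG is missing (e.g. for LunarLander-v2)
    env = gym.make(env_id)
    obs, info = env.reset(seed=0)
    print(env_id, "observation shape:", env.observation_space.shape)
    env.close()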

ICML State Abstraction

The following command trains a PPO policy for 1000 episodes and then runs the experiments:

python run_icml.py -a ppo -e Acrobot-v1 -pep 1000

To run a pre-trained policy, you have to specify the seed:

python run_icml.py -a ppo -e Acrobot-v1 -pep 1000 --seed 42 -tr f

This disables training (-tr f) and loads a policy that has been trained for 1000 episodes with seed 42.

run_icml.py accepts the following command-line parameters:

  • -a --algo: str - The algorithm to train
  • -e --env: str - The gym environment to run
  • -k --k-bins: int (default=1) - The number of bins used to discretize environments with continuous action spaces
  • -tr --train: bool (default=True) - Whether to train or not
  • -ex --experiment: bool (default=True) - If true, run the experiment
  • -ab --abstraction: bool (default=True) - If true, load or create the abstraction network
  • -ep --episodes: int - The total number of episodes to run for policy training and experiment training
  • -pep --policy-episodes: int (default=None) - The number of episodes to train the expert policy
  • -eep --experiment-episodes: int (default=None) - The number of episodes to train the experiment
  • -r --render: bool (default=False) - If true, render one episode of the algorithm in the environment
  • -rp --render-policy: bool (default=False) - If true, render one episode of the expert policy in the environment
  • -re --render-experiment: bool (default=False) - If true, render one episode of the experiment in the environment
  • -s --save: bool (default=True) - If true, save the trained model
  • -l --load: bool (default=False) - If true, load a trained abstraction network with the specified time-steps and algorithm
  • -le --load-experiment: bool (default=False) - If true, load a trained experiment with the specified time-steps and algorithm
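
The t/f values accepted by the boolean flags suggest a string-to-boolean parser on top of argparse. The sketch below shows how a few of these flags could be wired up; it illustrates the flag convention and is not the actual parser in run_icml.py:

import argparse

def str2bool(value: str) -> bool:
    # Accept the t/f shorthand used by the flags above.
    if value.lower() in ("t", "true", "1"):
        return True
    if value.lower() in ("f", "false", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected t or f, got {value!r}")

parser = argparse.ArgumentParser()
parser.add_argument("-a", "--algo", type=str, required=True)
parser.add_argument("-e", "--env", type=str, required=True)
parser.add_argument("-k", "--k-bins", type=int, default=1)
parser.add_argument("-tr", "--train", type=str2bool, default=True)
parser.add_argument("-pep", "--policy-episodes", type=int, default=None)
parser.add_argument("--seed", type=int, default=None)  # default is an assumption

args = parser.parse_args(["-a", "ppo", "-e", "Acrobot-v1", "-pep", "1000", "-tr", "f"])
print(args.train)  # False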

CAT_RL

Running the code

To run the code, you can use the following command:

python CAT-RL.py

This will run all the environments and render the model after training.

There are also some optional arguments you can use:

  • --env or -e to specify the environment you want to run
    • default: MountainCar
    • options: MountainCar, MountainCarContinuous, CartPole, LunarLander, Acrobot, Pendulum
  • --train or -t to train the model
    • default: t (True)
    • options: t, f (True, False)
  • --render or -r to render the model
    • default: t (True)
    • options: t, f (True, False)
  • --seed or -s to specify the seed. If rendering without training, you need to set the seed of the trained model
    • default: 0
  • --verbose or -v to print the progress of the training
    • default: t (True)
    • options: t, f (True, False)
  • --help or -h to get help

For example, to run the CartPole-v1 environment without rendering the model, you can use the following command:

python CAT-RL.py -r f -e CartPole

or to just render a trained model with seed 123 from the CartPole-v1 environment, you can use the following command:

python CAT-RL.py -t f -e CartPole -s 123

Tile Coding

Running the code

To run the code, you can use the following command:

python tileCoding.py

This will run the code and render the model after training.

There are also some optional arguments you can use:

  • --env or -e to specify the environment you want to run
    • default: MountainCar
    • options: MountainCar, MountainCarContinuous, CartPole, LunarLander, Acrobot, Pendulum
  • --train or -t to train the model
    • default: t (True)
    • options: t, f (True, False)
  • --render or -r to render the model
    • default: t (True)
    • options: t, f (True, False)
  • --seed or -s to specify the seed. If rendering without training, you need to set the seed of the trained model
    • default: 0
  • --verbose or -v to print the progress of the training
    • default: t (True)
    • options: t, f (True, False)
  • --help or -h to get help

For example, to run the CartPole-v1 environment without rendering the model, you can use the following command:

python tileCoding.py -r f -e CartPole

or to just render a trained model with seed 123 from the CartPole-v1 environment, you can use the following command:

python tileCoding.py -t f -e CartPole -s 123
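
For background, tile coding covers the continuous state space with several overlapping grids (tilings), each shifted by a small offset, and activates one tile per tiling. The sketch below illustrates the idea for a 2-D state such as MountainCar's (position, velocity); it is not the exact implementation in tileCoding.py:

import numpy as np

def tile_indices(state, low, high, n_tilings=8, tiles_per_dim=8):
    # Normalize the 2-D state to [0, 1] per dimension.
    state = np.asarray(state, dtype=float)
    scaled = (state - np.asarray(low)) / (np.asarray(high) - np.asarray(low))
    active = []
    for t in range(n_tilings):
        # Shift each tiling by a fraction of a tile width.
        offset = t / (n_tilings * tiles_per_dim)
        x, y = np.clip(np.floor((scaled + offset) * tiles_per_dim).astype(int),
                       0, tiles_per_dim - 1)
        # Flatten (tiling, x, y) into a single feature index.
        active.append(t * tiles_per_dim ** 2 + x * tiles_per_dim + y)
    return active  # one active feature per tiling

# Example: a MountainCar state (position, velocity)
print(tile_indices([-0.5, 0.0], low=[-1.2, -0.07], high=[0.6, 0.07]))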

Bins

Running the code

To run the code, you can use the following command:

python binQlearning.py

This will run the code and render the model after training.

There are also some optional arguments you can use:

  • --env or -e to specify the environment you want to run
    • default: MountainCar
    • options: MountainCar, MountainCarContinuous, CartPole, LunarLander, Acrobot, Pendulum
  • --train or -t to train the model
    • default: t (True)
    • options: t, f (True, False)
  • --render or -r to render the model
    • default: t (True)
    • options: t, f (True, False)
  • --seed or -s to specify the seed. If rendering without training, you need to set the seed of the trained model
    • default: 0
  • --verbose or -v to print the progress of the training
    • default: t (True)
    • options: t, f (True, False)
  • --help or -h to get help

For example, to run the CartPole-v1 environment without rendering the model, you can use the following command:

python binQlearning.py -r f -e CartPole

or to just render a trained model with seed 123 from the CartPole-v1 environment, you can use the following command:

python binQlearning.py -t f -e CartPole -s 123
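
For background, bin Q-learning discretizes each dimension of the observation into uniform bins, so a continuous state becomes a tuple of integers usable as a Q-table key. An illustrative sketch (binQlearning.py's exact scheme may differ):

import numpy as np

def make_bins(low, high, n_bins):
    # n_bins - 1 interior edges per state dimension.
    return [np.linspace(l, h, n_bins + 1)[1:-1] for l, h in zip(low, high)]

def discretize(state, bins):
    # Map a continuous state to a tuple of bin indices (0 .. n_bins - 1).
    return tuple(int(np.digitize(s, b)) for s, b in zip(state, bins))

# Example: CartPole's 4-D observation with 10 bins per dimension
# (the velocity ranges are clipped by hand, since they are unbounded).
bins = make_bins(low=[-4.8, -4.0, -0.418, -4.0], high=[4.8, 4.0, 0.418, 4.0], n_bins=10)
print(discretize([0.1, -0.2, 0.01, 0.3], bins))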

Running experiments

Fixed episodes

To run the experiments, you can use the following command:

python run_exp.py

This will run each algorithm a number of times (10 by default, configurable with --num below) for all the environments. The results are saved in the results folder and the models in the models folder.

There are also some optional arguments you can use:

  • --num or -n: specify the number of times to run each algorithm for each environment
    • default: 10

Trained models

The trained models for the different environments can be found in the models folder. The models are saved as .pkl files and can be loaded using the pickle library in Python.
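
For example, a model could be loaded like this (the file name is hypothetical; point it at an actual file in the models folder):

import pickle

# Hypothetical file name; list the models folder for real ones.
with open("models/CartPole_seed123.pkl", "rb") as f:
    model = pickle.load(f)
print(type(model))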

To run the experiment with trained models, you can use the following command:

python run_after_train.py

Data for this experiment can be found in the results-after-train folder.

Test k bins

To test the k bins, you can use the following command:

python test_k_bins.py

Data for this experiment can be found in k_bins_result-<bin>.
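
For reference, the k bins discretize a continuous action space into k evenly spaced actions (the -k --k-bins option of run_icml.py). An illustrative sketch for a 1-D action space:

import numpy as np

def discretize_action_space(low, high, k):
    # k evenly spaced actions covering [low, high].
    return np.linspace(low, high, k)

# Example: Pendulum-v1's torque range [-2, 2] with k = 5
print(discretize_action_space(-2.0, 2.0, 5))  # [-2. -1.  0.  1.  2.]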
