Agent Basics
RL agents are at the heart of Velora's framework and are the fastest way to get started with experiments.
Each agent has subtle differences, but we've designed them to act as drop-in replacements for each other. Their underlying functionality may differ, but their core API is identical, with the exception of optional hyperparameters.
At its core, every agent has three main operations:
- Initialization - creating the model.
- Training - running the model on an environment.
- Prediction - making predictions on unseen data.
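To make "drop-in replacement" concrete, here's a minimal sketch in plain Python. It doesn't use Velora at all: Agent, ToyAgentA and ToyAgentB are hypothetical stand-ins that only borrow the parameter names mentioned on this page (env_id, actor_neurons, critic_neurons, batch_size, n_episodes, state, hidden), and the batch size of 64 is an arbitrary choice. The point is simply that code written against the shared operations doesn't care which agent it receives:

```python
from typing import Any, Protocol


class Agent(Protocol):
    """Hypothetical shared interface: the operations available after initialization."""

    def train(self, batch_size: int, n_episodes: int = 1000) -> None: ...

    def predict(self, state: Any, hidden: Any = None) -> tuple[Any, Any]: ...


class ToyAgentA:
    """Stand-in agent (not a Velora class) that always picks action 0."""

    def __init__(self, env_id: str, actor_neurons: int, critic_neurons: int) -> None:
        self.env_id = env_id
        self.actor_neurons = actor_neurons
        self.critic_neurons = critic_neurons
        self.episodes_trained = 0

    def train(self, batch_size: int, n_episodes: int = 1000) -> None:
        # A real agent would learn here; the toy only counts episodes.
        self.episodes_trained += n_episodes

    def predict(self, state: Any, hidden: Any = None) -> tuple[Any, Any]:
        return 0, hidden


class ToyAgentB(ToyAgentA):
    """Different behaviour (always action 1), same interface."""

    def predict(self, state: Any, hidden: Any = None) -> tuple[Any, Any]:
        return 1, hidden


def run(agent: Agent) -> Any:
    """Train briefly, then make one prediction, using only the shared API."""
    agent.train(batch_size=64, n_episodes=10)
    action, _ = agent.predict(state=[0.0, 0.0])
    return action


# Swapping the class is the only change needed.
results = [run(cls("CartPole-v1", 20, 128)) for cls in (ToyAgentA, ToyAgentB)]
```

Both toy agents pass through run() unchanged, which is what we mean by an identical core API.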
Creating Agents
Creating an agent is really easy! We simply import our agent from the velora.models API and create a class instance.
Each model requires three main parameters:
- env_id - the Gymnasium environment ID, e.g., CartPole-v1 or InvertedPendulum-v5.
- actor_neurons - the number of decision/hidden nodes for the Actor network (e.g., 20 or 40).
- critic_neurons - the number of decision/hidden nodes for the Critic networks. We recommend setting this higher (128 or 256) than for the Actor network.
And that's it! Here's an example:
[Python code example: creating a NeuroFlowCT instance]
This code should work 'as is'.
Want to use a different agent? Just swap out NeuroFlowCT with a different one!
[Python code example: creating a different agent]
It really is that easy! 🤩
Agent Parameters
Each agent comes with a set of optional parameters that can be customized.
You can read more about them in the Agents documentation section.
Training an Agent
Training an agent is just as easy!
We call the train() method, supply it with a batch_size, and boom 💥, your agent will start training for 1000 episodes by default:
[Python code example: training an agent with train()]
This code should work 'as is'.
Want to change the number of episodes? Use the n_episodes parameter! What about how often the training status is logged to the console? Use the display_count parameter!
[Python code example: calling train() with n_episodes and display_count]
These are only two of the optional parameters for the train() method. Another worth mentioning is callbacks, but we'll talk about them later.
Like before, need a different agent? Just swap it out!
[Python code example: training a different agent]
The train() method will create a SQLite database called metrics.db in your local directory. It contains useful metrics that can be plotted to visualize the whole training process. How you use them is up to you!
We personally use and recommend a cloud-based solution (see the Analytics Callbacks section) which uses these metrics automatically.
However, we've included this offline method separately just in case you prefer it! 😉 We'll talk more about these metrics later in the Training Metrics section.
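If you'd like to peek inside metrics.db yourself, Python's built-in sqlite3 module is enough. The sketch below makes no assumptions about Velora's table layout (it isn't described on this page); it simply lists whichever tables and columns exist. The helper name describe_tables is our own, not part of Velora.

```python
import sqlite3


def describe_tables(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each table name in the database to its list of column names."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()

    schema: dict[str, list[str]] = {}
    for (table,) in tables:
        # PRAGMA statements can't take bound parameters, so quote the name manually.
        quoted = table.replace('"', '""')
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        # Each row is (cid, name, type, notnull, dflt_value, pk); keep the name.
        schema[table] = [column[1] for column in columns]
    return schema
```

After a training run, describe_tables(sqlite3.connect("metrics.db")) would show what has been recorded. Keep in mind that sqlite3.connect() creates an empty file if metrics.db doesn't exist yet, so run it from the directory you trained in.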
Making Predictions
For new predictions, we use the predict() method. This takes two parameters:
- state - the item to make a prediction on. Must be a torch.Tensor.
- hidden - the agent's hidden state.
Liquid Neural Networks are a recurrent architecture so a hidden state is required!
By default, we set hidden to None, so you don't need to provide it for a single prediction:
[Python code example: a single predict() call]
This code should work 'as is'.
Here, we get back an action prediction and an updated hidden state.
Things become slightly more complicated with multiple predictions because we need to feed the hidden state back into the predict() method like so:
[Python code example: repeated predict() calls, feeding the hidden state back in]
This code should work 'as is'.
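Here's the feedback pattern in isolation, using a toy stand-in rather than a real agent. ToyRecurrentAgent is hypothetical and has nothing to do with Velora's internals (it even uses plain lists instead of a torch.Tensor so it runs anywhere); it only mimics a predict(state, hidden) call that returns an action and an updated hidden state, as described above:

```python
class ToyRecurrentAgent:
    """Hypothetical stand-in with a predict(state, hidden) -> (action, hidden) shape."""

    def predict(self, state, hidden=None):
        # No hidden state supplied: start from an empty memory.
        if hidden is None:
            hidden = 0.0
        # The new hidden state blends the old memory with the current input.
        hidden = 0.5 * hidden + sum(state)
        # The action depends on the memory, not just the current state.
        action = 1 if hidden > 0 else 0
        return action, hidden


agent = ToyRecurrentAgent()
states = [[1.0, 0.0], [-3.0, 0.0], [0.5, 0.5]]

hidden = None  # the first call needs no hidden state
actions = []
for state in states:
    # Feed the previous hidden state back in on every step.
    action, hidden = agent.predict(state, hidden)
    actions.append(action)

# For comparison: the last state on its own, with no memory of earlier steps.
action_without_memory, _ = agent.predict(states[-1])
```

With the hidden state carried through, the last action is 0. Calling predict() on the last state alone gives 1 instead, so dropping the hidden state between steps changes the answer. That's why we pass it back in.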
Just like before, if you want to use a different agent, just swap out NeuroFlowCT with another one. Glorious, isn't it? 😉
That covers the basics! Next, we'll move on to callbacks. See you there! 👋