Entropy Modules¶
Entropy modules are an extension to the SAC algorithm that are used for automatic tuning.
The layout of the modules are identical but their underlying functionality differs to handle their respective use cases.
The only method differences are the required parameters for the compute_loss
method.
Entropy modules are a wrapper over the top of PyTorch functionality and are made up of the following components:
Attribute | Description | PyTorch Item |
---|---|---|
target |
The target entropy value. | float or torch.Tensor |
log_alpha |
A tunable parameter. | torch.nn.Parameter |
alpha |
The current entropy coefficient. | torch.Tensor |
optim |
The entropy optimizer. | torch.optim.Optimizer |
Discrete¶
For discrete
action spaces, we use the EntropyModuleDiscrete
class.
This accepts the following parameters:
Parameter | Description | Default |
---|---|---|
action_dim |
The dimension of the action space. | - |
initial_alpha |
The starting entropy coefficient value. | 1.0 |
optim |
The PyTorch optimizer. | torch.optim.Adam |
lr |
The optimizer learning rate. | 0.0003 |
device |
The device to perform computations on. E.g., cpu or cuda:0 . |
None |
Continuous¶
For continuous
action spaces, we use the EntropyModule
class.
The parameters are the same as the EntropyModuleDiscrete
class.
Compute Loss¶
To compute the module loss, we use the compute_loss
method:
Python | |
---|---|
1 2 3 4 5 |
|
Updating Gradients¶
To update the gradients, we use the gradient_step
method:
Python | |
---|---|
1 |
|
Config¶
To quickly get an overview of the modules parameters we can use the config
method:
Python | |
---|---|
1 |
|
This provides us with an EntropyParameters
config model containing details about the module.
Other modules have their own respective config models that are obtained using their attribute module.config
instead.
Next, we'll dive into working with static backbones
that Velora offers 👋.