velora.models.config

Config models for storing agent details.

BufferConfig

Bases: BaseModel

A config model for buffer details.

Attributes:

type (Literal['ReplayBuffer', 'RolloutBuffer']): the type of buffer
capacity (int): the maximum capacity of the buffer
state_dim (int): dimension of state observations
action_dim (int): dimension of actions
hidden_dim (int): dimension of hidden state

Source code in velora/models/config.py
class BufferConfig(BaseModel):
    """
    A config model for buffer details.

    Attributes:
        type: the type of buffer
        capacity: the maximum capacity of the buffer
        state_dim: dimension of state observations
        action_dim: dimension of actions
        hidden_dim: dimension of hidden state
    """

    type: Literal["ReplayBuffer", "RolloutBuffer"]
    capacity: int
    state_dim: int
    action_dim: int
    hidden_dim: int
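
As a quick illustration (the sizes below are hypothetical, not library defaults), a BufferConfig can be built directly, and pydantic rejects any `type` value outside the declared `Literal` options:

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


# Restated from the source above so the snippet runs standalone
class BufferConfig(BaseModel):
    type: Literal["ReplayBuffer", "RolloutBuffer"]
    capacity: int
    state_dim: int
    action_dim: int
    hidden_dim: int


# Hypothetical sizes, purely for illustration
config = BufferConfig(
    type="ReplayBuffer",
    capacity=100_000,
    state_dim=8,
    action_dim=2,
    hidden_dim=64,
)
print(config.type)  # ReplayBuffer

# A value outside the Literal options fails validation
try:
    BufferConfig(
        type="PrioritizedBuffer",
        capacity=1, state_dim=1, action_dim=1, hidden_dim=1,
    )
except ValidationError:
    print("invalid buffer type rejected")
```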

CriticConfig

Bases: BaseModel

A critic config model for storing a NeuroFlow agent's critic module details.

Attributes:

critic1 (ModuleConfig): details about the first critic network
critic2 (ModuleConfig): details about the second critic network

Source code in velora/models/config.py
class CriticConfig(BaseModel):
    """
    A critic config model for storing a NeuroFlow agent's critic module details.

    Attributes:
        critic1: details about the first critic network
        critic2: details about the second critic network
    """

    critic1: ModuleConfig
    critic2: ModuleConfig

CuriosityConfig

Bases: BaseModel

A config model for the Intrinsic Curiosity Module (ICM).

Attributes:

icm (ModuleConfig): details about the ICM
lr (float): the optimizer's learning rate
eta (float): importance scaling factor for intrinsic reward
beta (float): weight balancing for inverse vs. forward model

Source code in velora/models/config.py
class CuriosityConfig(BaseModel):
    """
    A config model for the Intrinsic Curiosity Module (ICM).

    Attributes:
        icm: details about the ICM
        lr: the optimizer's learning rate
        eta: importance scaling factor for intrinsic reward
        beta: weight balancing for inverse vs. forward model
    """

    icm: ModuleConfig
    lr: float
    eta: float
    beta: float

EntropyParameters

Bases: BaseModel

A config model for extra parameters for NeuroFlow agents.

Attributes:

lr (float): the entropy parameter learning rate
initial_alpha (float): the starting entropy coefficient value
target (float): the target entropy for automatic adjustment

Source code in velora/models/config.py
class EntropyParameters(BaseModel):
    """
    A config model for extra parameters for NeuroFlow agents.

    Attributes:
        lr: the entropy parameter learning rate
        initial_alpha: the starting entropy coefficient value
        target: the target entropy for automatic adjustment
    """

    lr: float
    initial_alpha: float
    target: float

ModelDetails

Bases: BaseModel

A config model for storing an agent's network model details.

Attributes:

type (str): the type of architecture used. Default is 'actor-critic'
state_dim (int): number of input features
actor_neurons (int): number of actor network decision nodes
critic_neurons (int): number of critic network decision nodes
action_dim (int): number of output features
action_type (Literal['discrete', 'continuous']): the type of action space. Default is 'continuous'
tau (float): the soft update factor for target networks
gamma (float): the reward discount factor
target_networks (bool): whether the agent uses target networks or not. Default is True
log_std (Tuple[float, float] | None): lower and upper bounds for the log standard deviation of the action distribution. Only required for 'continuous' spaces. Default is None
exploration_type (Literal['Entropy', 'CAT-Entropy']): the type of agent exploration used
actor (ModuleConfig): details about the Actor network
critic (CriticConfig): details about the Critic networks
entropy (EntropyParameters): details about the entropy exploration

Source code in velora/models/config.py
class ModelDetails(BaseModel):
    """
    A config model for storing an agent's network model details.

    Attributes:
        type: the type of architecture used. Default is `actor-critic`
        state_dim: number of input features
        actor_neurons: number of actor network decision nodes
        critic_neurons: number of critic network decision nodes
        action_dim: number of output features
        action_type: the type of action space. Default is `continuous`
        tau: the soft update factor for target networks
        gamma: the reward discount factor
        target_networks: whether the agent uses target networks or not.
            Default is `True`
        log_std: lower and upper bounds for the log standard deviation of the
            action distribution. Only required for `continuous` spaces.
            Default is `None`
        exploration_type: the type of agent exploration used
        actor: details about the Actor network
        critic: details about the Critic networks
        entropy: details about the entropy exploration
    """

    type: str = "actor-critic"
    state_dim: int
    actor_neurons: int
    critic_neurons: int
    action_dim: int
    tau: float
    gamma: float
    action_type: Literal["discrete", "continuous"] = "continuous"
    target_networks: bool = True
    log_std: Tuple[float, float] | None = None
    exploration_type: Literal["Entropy", "CAT-Entropy"]
    actor: ModuleConfig
    critic: CriticConfig
    entropy: EntropyParameters

ModuleConfig

Bases: BaseModel

A config model for a module's details.

Attributes:

active_params (int): active module parameter count
total_params (int): total module parameter count
architecture (Dict[str, Any]): a summary of the module's architecture

Source code in velora/models/config.py
class ModuleConfig(BaseModel):
    """
    A config model for a module's details.

    Attributes:
        active_params: active module parameter count
        total_params: total module parameter count
        architecture: a summary of the module's architecture
    """

    active_params: int
    total_params: int
    architecture: Dict[str, Any]

RLAgentConfig

Bases: BaseModel

A config model for NeuroFlow agents. Stored with agent states during the save() method.

Attributes:

agent (str): the type of agent used
env (str): the Gymnasium environment ID the model was trained on
seed (int): random number generator value
model_details (ModelDetails): the agent's network model details
buffer (BufferConfig): the buffer details
torch (TorchConfig): the PyTorch details
train_params (TrainConfig | None): the agent's training parameters. Default is None

Source code in velora/models/config.py
class RLAgentConfig(BaseModel):
    """
    A config model for NeuroFlow agents. Stored with agent states during the
    `save()` method.

    Attributes:
        agent: the type of agent used
        env: the Gymnasium environment ID the model was trained on
        seed: random number generator value
        model_details: the agent's network model details
        buffer: the buffer details
        torch: the PyTorch details
        train_params: the agent's training parameters. Default is `None`
    """

    agent: str
    env: str
    seed: int
    model_details: ModelDetails
    buffer: BufferConfig
    torch: TorchConfig
    train_params: TrainConfig | None = None

    def update(self, train_params: TrainConfig) -> Self:
        """
        Updates the training details of the model.

        Parameters:
            train_params (TrainConfig): a config containing training parameters

        Returns:
            self (Self): a new config model with the updated values.
        """
        return RLAgentConfig(
            train_params=train_params,
            **self.model_dump(exclude={"train_params"}),
        )

update(train_params)

Updates the training details of the model.

Parameters:

train_params (TrainConfig): a config containing training parameters. Required.

Returns:

self (Self): a new config model with the updated values.

Source code in velora/models/config.py
def update(self, train_params: TrainConfig) -> Self:
    """
    Updates the training details of the model.

    Parameters:
        train_params (TrainConfig): a config containing training parameters

    Returns:
        self (Self): a new config model with the updated values.
    """
    return RLAgentConfig(
        train_params=train_params,
        **self.model_dump(exclude={"train_params"}),
    )

SACExtraParameters

Bases: BaseModel

A config model for extra parameters for the Soft Actor-Critic (SAC) agent.

Attributes:

alpha_lr (float): the entropy parameter learning rate
initial_alpha (float): the starting entropy coefficient value
target_entropy (float): the target entropy for automatic adjustment
log_std_min (float | None): lower bound for the log standard deviation of the action distribution. Default is None
log_std_max (float | None): upper bound for the log standard deviation of the action distribution. Default is None

Source code in velora/models/config.py
class SACExtraParameters(BaseModel):
    """
    A config model for extra parameters for the Soft Actor-Critic (SAC) agent.

    Attributes:
        alpha_lr: the entropy parameter learning rate
        initial_alpha: the starting entropy coefficient value
        target_entropy: the target entropy for automatic adjustment
        log_std_min: lower bound for the log standard deviation of the
            action distribution. Default is `None`
        log_std_max: upper bound for the log standard deviation of the
            action distribution. Default is `None`
    """

    alpha_lr: float
    initial_alpha: float
    target_entropy: float
    log_std_min: float | None = None
    log_std_max: float | None = None

TorchConfig

Bases: BaseModel

A config model for PyTorch details.

Attributes:

device (str): the device used to train the model
optimizer (str): the type of optimizer used
loss (str): the type of loss function used

Source code in velora/models/config.py
class TorchConfig(BaseModel):
    """
    A config model for PyTorch details.

    Attributes:
        device: the device used to train the model
        optimizer: the type of optimizer used
        loss: the type of loss function used
    """

    device: str
    optimizer: str
    loss: str

TrainConfig

Bases: BaseModel

A config model for training parameter details.

Attributes:

batch_size (int): the size of the training batch
n_episodes (int): the total number of episodes trained for
window_size (int): reward moving average size (in episodes)
display_count (int): console training progress frequency (in episodes)
log_freq (int): metric logging frequency (in episodes)
callbacks (Dict[str, Any] | None): a dictionary of callback details. Default is None
max_steps (int): the maximum number of steps per training episode
warmup_steps (int): number of random steps to take before starting training

Source code in velora/models/config.py
class TrainConfig(BaseModel):
    """
    A config model for training parameter details.

    Attributes:
        batch_size: the size of the training batch
        n_episodes: the total number of episodes trained for
        window_size: reward moving average size (in episodes)
        display_count: console training progress frequency (in episodes)
        log_freq: metric logging frequency (in episodes)
        callbacks: a dictionary of callback details. Default is `None`
        max_steps: the maximum number of steps per training episode
        warmup_steps: number of random steps to take before starting
            training
    """

    batch_size: int
    n_episodes: int
    window_size: int
    display_count: int
    log_freq: int
    callbacks: Dict[str, Any] | None = None
    max_steps: int
    warmup_steps: int