CEBRA model#
- class cebra.CEBRA(model_architecture='offset1-model', device='cuda_if_available', criterion='infonce', distance='cosine', conditional=None, temperature=1.0, temperature_mode='constant', min_temperature=0.1, time_offsets=1, delta=None, max_iterations=10000, max_adapt_iterations=500, batch_size=None, learning_rate=0.0003, optimizer='adam', output_dimension=8, verbose=False, num_hidden_units=32, pad_before_transform=True, hybrid=False, optimizer_kwargs=(('betas', (0.9, 0.999)), ('eps', 1e-08), ('weight_decay', 0), ('amsgrad', False)))#
Bases: TransformerMixin, BaseEstimator
CEBRA model defined as part of a scikit-learn-like API.
- model_architecture#
The architecture of the neural network model trained with contrastive learning to encode the data. We provide a list of pre-defined models, which can be displayed by running cebra.models.get_options(). The user can also register their own custom models (see the Docs).
Default: offset1-model
- Type:
- device#
The device used for computing. Choose from cpu, cuda, cuda_if_available, or a particular GPU via cuda:0.
Default: cuda_if_available
- Type:
- criterion#
The training objective. Currently only the default InfoNCE is supported. The InfoNCE loss is specifically designed for contrastive learning.
Default: InfoNCE
- Type:
- distance#
The distance function used in the training objective to define the positive and negative samples with respect to the reference samples. Currently supports cosine and euclidean distances, cosine being specifically adapted for contrastive learning.
Default: cosine
- Type:
- conditional#
The conditional distribution to use to sample the positive samples. Reference and negative samples are drawn from a uniform prior. For positive samples, it currently supports 3 types of distributions: time_delta, time and delta. Positive samples are distributed around the reference samples using either time information (time) with a fixed time_offset from the reference samples' time steps, or the auxiliary variables, considering the empirical distribution of how behavior varies across time_offset timesteps (time_delta). Alternatively (delta), the distribution is set as a Gaussian distribution, parametrized by a fixed delta around the reference sample.
Default: None
- Type:
- temperature#
Factor by which to scale the similarity of the positive and negative pairs in the InfoNCE loss. Higher values yield “sharper”, more concentrated embeddings.
Default: 1.0
- Type:
- temperature_mode#
The constant mode uses a temperature set by the user that remains constant during training. The auto mode trains the temperature alongside the model. If set to auto, make sure to also set min_temperature to a value in the expected range (for that, a simple grid search over temperature can be run). Note that the auto mode is an experimental feature for now.
Default: constant
- Type:
- min_temperature#
The minimum temperature to maintain in case the temperature is optimized with the model (when setting temperature_mode to auto). This parameter will be ignored if temperature_mode is set to constant. Select None if no constraint should be applied.
Default: 0.1
- Type:
- time_offsets#
The offsets for building the empirical distribution within the chosen sampler. It can be a single value, or a tuple of values to sample from uniformly. Will only have an effect if conditional is set to time or time_delta.
- Type:
- max_iterations#
The number of iterations to train for. To pick the optimal number of iterations, start with a lower number (like 1,000) for faster training, and observe the value of the loss function (see plot_loss() to display the model loss over training). Make sure to pick a number of iterations high enough for the loss to converge.
Default: 10000
- Type:
- max_adapt_iterations#
The number of iterations for retraining the first layer when adapting the model to a new dataset. This parameter is only relevant when adapt=True in cebra.CEBRA.fit().
Default: 500
- Type:
- batch_size#
The batch size to use for training. If RAM or GPU memory allows, this parameter can be set to None to select batch gradient descent on the whole dataset. If you use mini-batch training, you should aim for a value greater than 512. Higher values typically get better results and smoother loss curves.
Default: None
- Type:
- learning_rate#
The learning rate for optimization. Higher learning rates can yield faster convergence, but also lead to instability. Tune this parameter along with temperature. For stable training with lower temperatures, it can make sense to lower the learning rate and train a bit longer.
Default: 0.0003
- Type:
- optimizer#
The optimizer to use. Refer to torch.optim for all possible optimizers. Right now, only adam is supported.
Default: adam
- Type:
- output_dimension#
The output dimensionality of the embedding. For visualization purposes, this can be set to 3 for an embedding based on the cosine distance, and 2-3 for an embedding based on the Euclidean distance (see distance). Alternatively, fit an embedding with a higher output dimensionality and then perform a linear ICA on top to visualize individual components.
Default: 8
- Type:
- verbose#
If True, show a progress bar during training.
Default: False
- Type:
- num_hidden_units#
The number of dimensions to use within the neural network model. Higher numbers slow down training, but make the model more expressive and can result in a better embedding. Especially if you find that the embeddings are not consistent across runs, increase num_hidden_units and output_dimension to increase the model size and output dimensionality.
Default: 32
- Type:
- pad_before_transform#
If False, the output sequence will be smaller than the input sequence due to the receptive field of the model. For example, if the input sequence is 100 steps long and a model with a receptive field of 10 is used, the output sequence will only be 100 - 10 + 1 = 91 steps long. For typical use cases, this parameter can be left at the default.
Default: True
- Type:
- hybrid#
If True, the model will be trained using both the time-contrastive and the selected behavior-contrastive loss functions.
Default: False
- Type:
- optimizer_kwargs#
Additional optimization parameters. These have the form ((key, value), (key, value)) and are passed to the PyTorch optimizer specified through the optimizer argument. Refer to the optimizer documentation in torch.optim for further information on how to format the arguments.
Default: (('betas', (0.9, 0.999)), ('eps', 1e-08), ('weight_decay', 0), ('amsgrad', False))
- Type:
Example
>>> import cebra
>>> cebra_model = cebra.CEBRA(model_architecture='offset10-model',
...                           batch_size=512,
...                           learning_rate=3e-4,
...                           temperature=1,
...                           output_dimension=3,
...                           max_iterations=10,
...                           distance='cosine',
...                           conditional='time_delta',
...                           device='cuda_if_available',
...                           verbose=True,
...                           time_offsets=10)
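As noted for temperature_mode, the auto mode learns the temperature during training and should be paired with an explicit min_temperature. A minimal sketch of that configuration, using only constructor arguments listed above (the specific values are illustrative):
>>> import cebra
>>> auto_temp_model = cebra.CEBRA(model_architecture='offset10-model',
...                               temperature_mode='auto',
...                               min_temperature=0.1,
...                               max_iterations=10,
...                               output_dimension=3)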
- classmethod supported_model_architectures(pattern='*')#
Get a list of supported model architectures.
These values can be directly passed to the model_architecture argument.
- Parameters:
pattern (str) – Optional pattern for filtering the architecture list. Should use the fnmatch patterns.
- Return type:
- Returns:
A list of all supported model architectures.
Note
It is always possible to use the additional model architectures given by cebra.models.get_options() via the CEBRA PyTorch API.
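Example
A minimal sketch of filtering with an fnmatch pattern; the pattern value is illustrative and the returned list depends on the installed CEBRA version:
>>> import cebra
>>> # List only architectures whose names start with "offset".
>>> offset_archs = cebra.CEBRA.supported_model_architectures(pattern='offset*')
>>> isinstance(offset_archs, list)
True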
- partial_fit(X, *y, callback=None, callback_frequency=None)#
Partially fit the estimator to the given dataset.
It is useful when the whole dataset is too big to fit in memory at once.
Note
The method allows incremental learning from batches of instances. Using partial_fit() on a partially fitted model will iteratively continue training over the partially fitted parameters. To reset the parameters at each new fitting, fit() must be used.
- Parameters:
X (Union[ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]], Tensor]) – A 2D data matrix.
y – An arbitrary amount of continuous indices passed as 2D matrices, and up to one discrete index passed as a 1D array. Each index has to match the length of X.
callback (Optional[Callable[[int, Solver], None]]) – If a function is passed here with signature callback(num_steps, solver), the function will be regularly called at the specified callback_frequency.
callback_frequency (Optional[int]) – Specify the number of iterations that need to pass before triggering the specified callback.
- Return type:
- Returns:
self, to allow chaining of operations.
Example
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.partial_fit(dataset)
CEBRA(max_iterations=10)
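Because partial_fit() continues training from the current parameters (see the Note above), it can be called repeatedly on the same model; a minimal sketch, with data and iteration counts that are illustrative only:
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.partial_fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.partial_fit(dataset)  # continues from the partially fitted parameters
CEBRA(max_iterations=10)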
- fit(X, *y, adapt=False, callback=None, callback_frequency=None)#
Fit the estimator to the given dataset, either by initializing a new model or by adapting the existing trained model.
Note
Re-fitting a fitted model with fit() will reset the parameters and number of iterations. To continue fitting from the previous fit, partial_fit() must be used.
Tip
We recommend saving the model, using cebra.CEBRA.save(), before adapting it to a different dataset (setting adapt=True) as the adapted model will replace the previous model in cebra_model.state_dict_.
- Parameters:
X (Union[ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]], Tensor]) – A 2D data matrix.
y – An arbitrary amount of continuous indices passed as 2D matrices, and up to one discrete index passed as a 1D array. Each index has to match the length of X.
adapt (bool) – If True, the estimator will be adapted to the given data. This parameter is of use only once the estimator has been fitted at least once (i.e., cebra.CEBRA.fit() has been called already). Note that it can be used on a fitted model that was saved and reloaded, using cebra.CEBRA.save() and cebra.CEBRA.load(). To adapt the model, the first layer of the model is reset so that it corresponds to the new features dimension. The parameters for all other layers are fixed and the first reinitialized layer is re-trained for cebra.CEBRA.max_adapt_iterations.
callback (Optional[Callable[[int, Solver], None]]) – If a function is passed here with signature callback(num_steps, solver), the function will be regularly called at the specified callback_frequency.
callback_frequency (Optional[int]) – Specify the number of iterations that need to pass before triggering the specified callback.
- Return type:
- Returns:
self, to allow chaining of operations.
Example
>>> import cebra
>>> import numpy as np
>>> import tempfile
>>> from pathlib import Path
>>> tmp_file = Path(tempfile.gettempdir(), 'cebra.pt')
>>> dataset = np.random.uniform(0, 1, (1000, 20))
>>> dataset2 = np.random.uniform(0, 1, (1000, 40))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.save(tmp_file)
>>> cebra_model.fit(dataset2, adapt=True)
CEBRA(max_iterations=10)
>>> tmp_file.unlink()
- transform(X, session_id=None)#
Transform an input sequence and return the embedding.
- Parameters:
- Return type:
ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]]
- Returns:
A numpy.array() of size time x output_dimension.
Example
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> embedding = cebra_model.transform(dataset)
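With the defaults described above (output_dimension=8 and pad_before_transform=True), the embedding keeps the length of the input; a minimal check continuing the example above:
>>> embedding.shape
(1000, 8)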
- fit_transform(X, *y, adapt=False, callback=None, callback_frequency=None)#
Composition of fit() and transform().
- Parameters:
X (Union[ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]], Tensor]) – A 2D data matrix.
y – An arbitrary amount of continuous indices passed as 2D matrices, and up to one discrete index passed as a 1D array. Each index has to match the length of X.
adapt (bool) – If True, the estimator will be adapted to the given data. This parameter is of use only once the estimator has been fitted at least once (i.e., cebra.CEBRA.fit() has been called already). Note that it can be used on a fitted model that was saved and reloaded, using cebra.CEBRA.save() and cebra.CEBRA.load().
callback (Optional[Callable[[int, Solver], None]]) – If a function is passed here with signature callback(num_steps, solver), the function will be regularly called at the specified callback_frequency.
callback_frequency (Optional[int]) – Specify the number of iterations that need to pass before triggering the specified callback.
- Return type:
ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]]
- Returns:
A numpy.array() of size time x output_dimension.
Example
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> embedding = cebra_model.fit_transform(dataset)
- save(filename, backend='sklearn')#
Save the model to disk.
- Parameters:
- Returns:
The saved model checkpoint.
Note
The save/load functionalities may change in a future version.
- File Format:
The saved model checkpoint file format depends on the specified backend.
- “sklearn” backend (default):
The model is saved in a PyTorch-compatible format using torch.save. The saved checkpoint is a dictionary containing the following elements:
- 'args': A dictionary of parameters used to initialize the CEBRA model.
- 'state': The state of the CEBRA model, which includes various internal attributes.
- 'state_dict': The state dictionary of the underlying solver used by CEBRA.
- 'metadata': Additional metadata about the saved model, including the backend used and the versions of CEBRA, PyTorch, NumPy and scikit-learn.
- “torch” backend:
The model is directly saved using torch.save with no additional information. The saved file contains the entire CEBRA model state.
Example
>>> import cebra
>>> import numpy as np
>>> import tempfile
>>> from pathlib import Path
>>> tmp_file = Path(tempfile.gettempdir(), 'test.jl')
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.save(tmp_file)
>>> tmp_file.unlink()
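To store the raw PyTorch checkpoint described under the "torch" backend above, the backend argument can be set accordingly; a minimal sketch mirroring the example above (the file name is illustrative):
>>> import cebra
>>> import numpy as np
>>> import tempfile
>>> from pathlib import Path
>>> tmp_file = Path(tempfile.gettempdir(), 'cebra_torch.pt')
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.save(tmp_file, backend='torch')
>>> tmp_file.unlink()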
- classmethod load(filename, backend='auto', weights_only=None, **kwargs)#
Load a model from disk.
- Parameters:
filename (str) – The path to the file from which to load the trained model.
backend (Literal['auto', 'sklearn', 'torch']) – A string identifying the used backend.
weights_only (Optional[bool]) – Indicates whether the unpickler should be restricted to loading only tensors, primitive types, dictionaries and any types added via torch.serialization.add_safe_globals(). See torch.load() with weights_only=True for more details. It is recommended to leave this at the default value of None, which sets the argument to False for torch<2.6, and True for higher versions of torch. If you experience issues with loading custom models (specified outside of the CEBRA package), you can try to set this to False if you trust the source of the model.
kwargs – Optional keyword arguments passed directly to the loader.
- Return type:
- Returns:
The loaded model.
Note
Experimental functionality. Do not expect the save/load functionalities to be backward compatible yet between CEBRA versions!
For information about the file format please refer to cebra.CEBRA.save().
Example
>>> import cebra
>>> import numpy as np
>>> import tempfile
>>> from pathlib import Path
>>> tmp_file = Path(tempfile.gettempdir(), 'cebra.pt')
>>> dataset = np.random.uniform(0, 1, (1000, 20))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.save(tmp_file)
>>> loaded_model = cebra.CEBRA.load(tmp_file)
>>> embedding = loaded_model.transform(dataset)
>>> tmp_file.unlink()
- to(device)#
Moves the cebra model to the specified device.
- Parameters:
device (Union[str, device]) – The device to move the cebra model to. This can be a string representing the device ('cpu', 'cuda', cuda:device_id, or 'mps') or a torch.device object.
- Returns:
The cebra model instance.
Example
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10, device="cuda_if_available")
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model = cebra_model.to("cpu")