CEBRA model#
- class cebra.CEBRA(model_architecture='offset1-model', device='cuda_if_available', criterion='infonce', distance='cosine', conditional=None, temperature=1.0, temperature_mode='constant', min_temperature=0.1, time_offsets=1, delta=None, max_iterations=10000, max_adapt_iterations=500, batch_size=None, learning_rate=0.0003, optimizer='adam', output_dimension=8, verbose=False, num_hidden_units=32, pad_before_transform=True, hybrid=False, optimizer_kwargs=(('betas', (0.9, 0.999)), ('eps', 1e-08), ('weight_decay', 0), ('amsgrad', False)))#
Bases: TransformerMixin, BaseEstimator
CEBRA model defined as part of a scikit-learn-like API.
- model_architecture#
The architecture of the neural network model trained with contrastive learning to encode the data. We provide a list of pre-defined models, which can be displayed by running cebra.models.get_options(). The user can also register their own custom models (see the Docs).
Default: offset1-model
- Type:
- device#
The device used for computing. Choose from cpu, cuda, cuda_if_available, or a particular GPU via cuda:0.
Default: cuda_if_available
- Type:
- criterion#
The training objective. Currently only the default InfoNCE is supported. The InfoNCE loss is specifically designed for contrastive learning.
Default: InfoNCE
- Type:
- distance#
The distance function used in the training objective to define the positive and negative samples with respect to the reference samples. Currently supports cosine and euclidean distances, cosine being specifically adapted for contrastive learning.
Default: cosine
- Type:
- conditional#
The conditional distribution to use to sample the positive samples. Reference and negative samples are drawn from a uniform prior. For positive samples, it currently supports 3 types of distributions: time_delta, time and delta. Positive samples are distributed around the reference samples using either time information (time) with a fixed time_offset from the reference samples' time steps, or the auxiliary variables, considering the empirical distribution of how behavior varies across time_offset timesteps (time_delta). Alternatively (delta), the distribution is set as a Gaussian distribution, parametrized by a fixed delta around the reference sample.
Default: None
- Type:
- temperature#
Factor by which to scale the similarity of the positive and negative pairs in the InfoNCE loss. Higher values yield “sharper”, more concentrated embeddings.
Default: 1.0
- Type:
- temperature_mode#
The constant mode uses a temperature set by the user that remains constant during training. The auto mode trains the temperature alongside the model. If set to auto, make sure to also set min_temperature to a value in the expected range (for that, a simple grid search over temperature can be run). Note that the auto mode is an experimental feature for now.
Default: constant
- Type:
- min_temperature#
The minimum temperature to maintain in case the temperature is optimized with the model (when setting temperature_mode to auto). This parameter will be ignored if temperature_mode is set to constant. Select None if no constraint should be applied.
Default: 0.1
- Type:
- time_offsets#
The offsets for building the empirical distribution within the chosen sampler. It can be a single value, or a tuple of values to sample from uniformly. Will only have an effect if conditional is set to time or time_delta.
- Type:
- max_iterations#
The number of iterations to train for. To pick the optimal number of iterations, start with a lower number (like 1,000) for faster training, and observe the value of the loss function (see plot_loss() to display the model loss over training). Make sure to pick a number of iterations high enough for the loss to converge.
Default: 10000
- Type:
- max_adapt_iterations#
The number of iterations for retraining the first layer when adapting the model to a new dataset. This parameter is only relevant when adapt=True in cebra.CEBRA.fit().
Default: 500
- Type:
- batch_size#
The batch size to use for training. If RAM or GPU memory allows, this parameter can be set to None to select batch gradient descent on the whole dataset. If you use mini-batch training, you should aim for a value greater than 512. Higher values typically get better results and smoother loss curves.
Default: None
- Type:
- learning_rate#
The learning rate for optimization. Higher learning rates can yield faster convergence, but also lead to instability. Tune this parameter along with temperature. For stable training with lower temperatures, it can make sense to lower the learning rate and train a bit longer.
Default: 0.0003
- Type:
- optimizer#
The optimizer to use. Refer to torch.optim for all possible optimizers. Right now, only adam is supported.
Default: adam
- Type:
- output_dimension#
The output dimensionality of the embedding. For visualization purposes, this can be set to 3 for an embedding based on the cosine distance, and 2-3 for an embedding based on the Euclidean distance (see distance). Alternatively, fit an embedding with a higher output dimensionality and then perform a linear ICA on top to visualize individual components.
Default: 8
- Type:
- verbose#
If True, show a progress bar during training.
Default: False
- Type:
- num_hidden_units#
The number of dimensions to use within the neural network model. Higher numbers slow down training, but make the model more expressive and can result in a better embedding. Especially if you find that the embeddings are not consistent across runs, increase num_hidden_units and output_dimension to increase the model size and output dimensionality.
Default: 32
- Type:
- pad_before_transform#
If False, the output sequence will be smaller than the input sequence due to the receptive field of the model. For example, if the input sequence is 100 steps long and a model with a receptive field of 10 is used, the output sequence will only be 100 - 10 + 1 = 91 steps long. For typical use cases, this parameter can be left at the default.
Default: True
- Type:
- hybrid#
If True, the model will be trained using both the time-contrastive and the selected behavior-contrastive loss functions.
Default: False
- Type:
- optimizer_kwargs#
Additional optimization parameters. These have the form ((key, value), (key, value)) and are passed to the PyTorch optimizer specified through the optimizer argument. Refer to the optimizer documentation in torch.optim for further information on how to format the arguments.
Default: (('betas', (0.9, 0.999)), ('eps', 1e-08), ('weight_decay', 0), ('amsgrad', False))
- Type:
Example
>>> import cebra
>>> cebra_model = cebra.CEBRA(model_architecture='offset10-model',
...                           batch_size=512,
...                           learning_rate=3e-4,
...                           temperature=1,
...                           output_dimension=3,
...                           max_iterations=10,
...                           distance='cosine',
...                           conditional='time_delta',
...                           device='cuda_if_available',
...                           verbose=True,
...                           time_offsets=10)
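As noted for temperature_mode, the auto mode learns the temperature during training and should be paired with an explicit min_temperature. A minimal sketch of that configuration, using only constructor arguments listed above (the specific values are illustrative):
>>> import cebra
>>> auto_temp_model = cebra.CEBRA(model_architecture='offset10-model',
...                               temperature_mode='auto',
...                               min_temperature=0.1,
...                               max_iterations=10,
...                               output_dimension=3)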
- classmethod supported_model_architectures(pattern='*')#
Get a list of supported model architectures.
These values can be directly passed to the model_architecture argument.
- Parameters:
pattern (str) – Optional pattern for filtering the architecture list. Should use the fnmatch patterns.
- Return type:
- Returns:
A list of all supported model architectures.
Note
It is always possible to use the additional model architectures given by cebra.models.get_options() via the CEBRA PyTorch API.
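Example
A minimal sketch of filtering with an fnmatch pattern; the pattern value is illustrative and the returned list depends on the installed CEBRA version:
>>> import cebra
>>> # List only architectures whose names start with "offset".
>>> offset_archs = cebra.CEBRA.supported_model_architectures(pattern='offset*')
>>> isinstance(offset_archs, list)
True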
- partial_fit(X, *y, callback=None, callback_frequency=None)#
Partially fit the estimator to the given dataset.
It is useful when the whole dataset is too big to fit in memory at once.
Note
The method allows incremental learning from batches of instances. Using partial_fit() on a partially fitted model will iteratively continue training over the partially fitted parameters. To reset the parameters at each new fitting, fit() must be used.
- Parameters:
X (Union[ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]], Tensor]) – A 2D data matrix.
y – An arbitrary amount of continuous indices passed as 2D matrices, and up to one discrete index passed as a 1D array. Each index has to match the length of X.
callback (Optional[Callable[[int, Solver], None]]) – If a function is passed here with signature callback(num_steps, solver), the function will be regularly called at the specified callback_frequency.
callback_frequency (Optional[int]) – Specify the number of iterations that need to pass before triggering the specified callback.
- Return type:
- Returns:
self, to allow chaining of operations.
Example
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.partial_fit(dataset)
CEBRA(max_iterations=10)
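Because partial_fit() continues training from the current parameters (see the Note above), it can be called repeatedly on the same model; a minimal sketch, with data and iteration counts that are illustrative only:
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.partial_fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.partial_fit(dataset)  # continues from the partially fitted parameters
CEBRA(max_iterations=10)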
- fit(X, *y, adapt=False, callback=None, callback_frequency=None)#
Fit the estimator to the given dataset, either by initializing a new model or by adapting the existing trained model.
Note
Re-fitting a fitted model with fit() will reset the parameters and number of iterations. To continue fitting from the previous fit, partial_fit() must be used.
Tip
We recommend saving the model, using cebra.CEBRA.save(), before adapting it to a different dataset (setting adapt=True) as the adapted model will replace the previous model in cebra_model.state_dict_.
- Parameters:
X (Union[ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]], Tensor]) – A 2D data matrix.
y – An arbitrary amount of continuous indices passed as 2D matrices, and up to one discrete index passed as a 1D array. Each index has to match the length of X.
adapt (bool) – If True, the estimator will be adapted to the given data. This parameter is of use only once the estimator has been fitted at least once (i.e., cebra.CEBRA.fit() has been called already). Note that it can be used on a fitted model that was saved and reloaded, using cebra.CEBRA.save() and cebra.CEBRA.load(). To adapt the model, the first layer of the model is reset so that it corresponds to the new features dimension. The parameters for all other layers are fixed and the first reinitialized layer is re-trained for cebra.CEBRA.max_adapt_iterations.
callback (Optional[Callable[[int, Solver], None]]) – If a function is passed here with signature callback(num_steps, solver), the function will be regularly called at the specified callback_frequency.
callback_frequency (Optional[int]) – Specify the number of iterations that need to pass before triggering the specified callback.
- Return type:
- Returns:
self, to allow chaining of operations.
Example
>>> import cebra
>>> import numpy as np
>>> import tempfile
>>> from pathlib import Path
>>> tmp_file = Path(tempfile.gettempdir(), 'cebra.pt')
>>> dataset = np.random.uniform(0, 1, (1000, 20))
>>> dataset2 = np.random.uniform(0, 1, (1000, 40))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.save(tmp_file)
>>> cebra_model.fit(dataset2, adapt=True)
CEBRA(max_iterations=10)
>>> tmp_file.unlink()
- transform(X, session_id=None)#
Transform an input sequence and return the embedding.
- Parameters:
- Return type:
ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]]
- Returns:
A numpy.array() of size time x output_dimension.
Example
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> embedding = cebra_model.transform(dataset)
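With the defaults described above (output_dimension=8 and pad_before_transform=True), the embedding keeps the length of the input; a minimal check continuing the example above:
>>> embedding.shape
(1000, 8)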
- fit_transform(X, *y, adapt=False, callback=None, callback_frequency=None)#
Composition of fit() and transform().
- Parameters:
X (Union[ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]], Tensor]) – A 2D data matrix.
y – An arbitrary amount of continuous indices passed as 2D matrices, and up to one discrete index passed as a 1D array. Each index has to match the length of X.
adapt (bool) – If True, the estimator will be adapted to the given data. This parameter is of use only once the estimator has been fitted at least once (i.e., cebra.CEBRA.fit() has been called already). Note that it can be used on a fitted model that was saved and reloaded, using cebra.CEBRA.save() and cebra.CEBRA.load().
callback (Optional[Callable[[int, Solver], None]]) – If a function is passed here with signature callback(num_steps, solver), the function will be regularly called at the specified callback_frequency.
callback_frequency (Optional[int]) – Specify the number of iterations that need to pass before triggering the specified callback.
- Return type:
ndarray[tuple[int, ...], dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]]
- Returns:
A numpy.array() of size time x output_dimension.
Example
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> embedding = cebra_model.fit_transform(dataset)
- save(filename, backend='sklearn')#
Save the model to disk.
- Parameters:
- Returns:
The saved model checkpoint.
Note
The save/load functionalities may change in a future version.
- File Format:
The saved model checkpoint file format depends on the specified backend.
- “sklearn” backend (default):
The model is saved in a PyTorch-compatible format using torch.save. The saved checkpoint is a dictionary containing the following elements:
- 'args': A dictionary of parameters used to initialize the CEBRA model.
- 'state': The state of the CEBRA model, which includes various internal attributes.
- 'state_dict': The state dictionary of the underlying solver used by CEBRA.
- 'metadata': Additional metadata about the saved model, including the backend used and the versions of CEBRA, PyTorch, NumPy and scikit-learn.
- “torch” backend:
The model is directly saved using torch.save with no additional information. The saved file contains the entire CEBRA model state.
Example
>>> import cebra
>>> import numpy as np
>>> import tempfile
>>> from pathlib import Path
>>> tmp_file = Path(tempfile.gettempdir(), 'test.jl')
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.save(tmp_file)
>>> tmp_file.unlink()
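To store the raw PyTorch checkpoint described under the "torch" backend above, the backend argument can be set accordingly; a minimal sketch mirroring the example above (the file name is illustrative):
>>> import cebra
>>> import numpy as np
>>> import tempfile
>>> from pathlib import Path
>>> tmp_file = Path(tempfile.gettempdir(), 'cebra_torch.pt')
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.save(tmp_file, backend='torch')
>>> tmp_file.unlink()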
- classmethod load(filename, backend='auto', weights_only=None, **kwargs)#
Load a model from disk.
- Parameters:
filename (str) – The path to the file from which to load the trained model.
backend (Literal['auto', 'sklearn', 'torch']) – A string identifying the used backend.
weights_only (Optional[bool]) – Indicates whether the unpickler should be restricted to loading only tensors, primitive types, dictionaries and any types added via torch.serialization.add_safe_globals(). See torch.load() with weights_only=True for more details. It is recommended to leave this at the default value of None, which sets the argument to False for torch<2.6, and True for higher versions of torch. If you experience issues with loading custom models (specified outside of the CEBRA package), you can try to set this to False if you trust the source of the model.
kwargs – Optional keyword arguments passed directly to the loader.
- Return type:
- Returns:
The loaded model.
Note
Experimental functionality. Do not expect the save/load functionalities to be backward compatible yet between CEBRA versions!
For information about the file format please refer to cebra.CEBRA.save().
Example
>>> import cebra
>>> import numpy as np
>>> import tempfile
>>> from pathlib import Path
>>> tmp_file = Path(tempfile.gettempdir(), 'cebra.pt')
>>> dataset = np.random.uniform(0, 1, (1000, 20))
>>> cebra_model = cebra.CEBRA(max_iterations=10)
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model.save(tmp_file)
>>> loaded_model = cebra.CEBRA.load(tmp_file)
>>> embedding = loaded_model.transform(dataset)
>>> tmp_file.unlink()
- to(device)#
Moves the cebra model to the specified device.
- Parameters:
device (Union[str, device]) – The device to move the cebra model to. This can be a string representing the device ('cpu', 'cuda', cuda:device_id, or 'mps') or a torch.device object.
- Returns:
The cebra model instance.
Example
>>> import cebra
>>> import numpy as np
>>> dataset = np.random.uniform(0, 1, (1000, 30))
>>> cebra_model = cebra.CEBRA(max_iterations=10, device="cuda_if_available")
>>> cebra_model.fit(dataset)
CEBRA(max_iterations=10)
>>> cebra_model = cebra_model.to("cpu")