Continuous Thought Machines (CTMs) are a new architecture that puts temporal neural dynamics at the forefront of the training process.


Intuition

Neuronal Synchronization

We want the model’s relationship with the data to depend not only on a single snapshot of the network state, but also on complex temporal dynamics, through which time dependency can emerge as a variable in the latent space.

We do this by first collecting all the post-activations into a “post-activation history”. Writing $z^t \in \mathbb{R}^D$ for the post-activations of the $D$ neurons at internal tick $t$, the history is

$$Z^t = [z^1, z^2, \ldots, z^t] \in \mathbb{R}^{D \times t}.$$

It is important to note that the size of $Z^t$ is not fixed; it grows with the current tick $t$.
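To make the bookkeeping concrete, here is a minimal PyTorch sketch of accumulating a post-activation history. The dimensions and the random $z^t$ are placeholders for whatever the real neuron models produce.

```python
import torch

D = 64   # number of neurons (illustrative)
T = 10   # number of internal ticks to simulate

# Collect post-activations tick by tick; the history grows with t,
# so its size is not fixed.
post_history = []                       # list of (D,) tensors
for t in range(T):
    z_t = torch.randn(D)                # stand-in for the real post-activations at tick t
    post_history.append(z_t)

Z = torch.stack(post_history, dim=-1)   # shape (D, t): one column per tick
print(Z.shape)                          # torch.Size([64, 10])
```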

We now define neuronal synchronization as the matrix of inner products between post-activation histories:

$$S^t = Z^t \cdot (Z^t)^\top \in \mathbb{R}^{D \times D},$$

so that entry $S^t_{ij}$ measures how strongly neurons $i$ and $j$ fire together over the ticks so far.
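Under the same assumed shapes as above, the full synchronization matrix is a one-line sketch:

```python
import torch

D, T = 64, 10
Z = torch.randn(D, T)   # post-activation history from the previous sketch

# S[i, j] is the inner product of neuron i's and neuron j's histories,
# i.e. how strongly the two neurons fire together over the ticks so far.
S = Z @ Z.T             # shape (D, D)
```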

However, it is important to note that this operation scales very poorly ($S^t$ has $O(D^2)$ entries), so instead we use random sampling to choose neuron pairs. We draw two fixed sets of index pairs $(i, j)$ from $S^t$, of sizes $D_{\text{out}}$ and $D_{\text{action}}$, yielding two synchronization representations:

$$S^t_{\text{out}} \in \mathbb{R}^{D_{\text{out}}}, \qquad S^t_{\text{action}} \in \mathbb{R}^{D_{\text{action}}}.$$
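A sketch of the pair-sampling step follows. The pair counts `D_out` and `D_action` and the index names are illustrative, and sampling with independent `randint` draws is one simple choice rather than necessarily the exact scheme used in practice.

```python
import torch

D, T = 64, 10
Z = torch.randn(D, T)            # post-activation history (stand-in)

D_out, D_action = 16, 8          # number of sampled pairs (illustrative sizes)

# Sample random neuron-index pairs once (e.g. at initialization) and keep them fixed.
i_out = torch.randint(0, D, (D_out,))
j_out = torch.randint(0, D, (D_out,))
i_act = torch.randint(0, D, (D_action,))
j_act = torch.randint(0, D, (D_action,))

# Each entry is the synchronization of one sampled pair, computed directly
# from the histories so the full D x D matrix is never materialized.
S_out = (Z[i_out] * Z[j_out]).sum(dim=-1)       # shape (D_out,)
S_action = (Z[i_act] * Z[j_act]).sum(dim=-1)    # shape (D_action,)
```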

We then project $S^t_{\text{out}}$ onto an output space, and $S^t_{\text{action}}$ onto the space used to interact with the input (e.g. attention queries):

$$y^t = W_{\text{out}} \, S^t_{\text{out}}, \qquad q^t = W_{\text{in}} \, S^t_{\text{action}}.$$
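A sketch of these projections, assuming simple learned linear maps and illustrative output/query dimensions (`n_classes` and `d_query` are placeholders, not values from the paper):

```python
import torch
import torch.nn as nn

D_out, D_action = 16, 8
S_out = torch.randn(D_out)          # sampled synchronization representation (stand-in)
S_action = torch.randn(D_action)

n_classes, d_query = 10, 32         # assumed output / query dimensions

W_out = nn.Linear(D_out, n_classes, bias=False)   # projection to the output space
W_in = nn.Linear(D_action, d_query, bias=False)   # projection to e.g. attention queries

y_t = W_out(S_out)       # per-tick output, shape (n_classes,)
q_t = W_in(S_action)     # per-tick query vector, shape (d_query,)
```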

Putting this together, we compute neuronal synchronization at each tick by building the post-activation history $Z^t$, forming the sampled pair products, and projecting the resulting representations onto the output and action spaces.

Architecture

Synapse Model

Neuron Level Models

Synchronization