<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Posts | Hugo Ninou</title><link>https://hugoninou.netlify.app/post/</link><atom:link href="https://hugoninou.netlify.app/post/index.xml" rel="self" type="application/rss+xml"/><description>Posts</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><image><url>https://hugoninou.netlify.app/media/icon_hu587af1cd70838232b3bc4a96d913d2d7_45151_512x512_fill_lanczos_center_3.png</url><title>Posts</title><link>https://hugoninou.netlify.app/post/</link></image><item><title>Paper deep dive : Linking connectivity, dynamics and computations in low-rank recurrent neural networks, Mastrogiuseppe &amp; Ostojic, 2018</title><link>https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/</link><pubDate>Sun, 08 Sep 2024 15:36:22 +0000</pubDate><guid>https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/</guid><description>&lt;h2 id="table-of-contents">Table of Contents&lt;/h2>
&lt;ol>
&lt;li>&lt;a href="#introduction">Introduction&lt;/a>&lt;/li>
&lt;li>&lt;a href="#theoretical-framework">Theoretical Framework&lt;/a>
&lt;ol>
&lt;li>&lt;a href="#building-a-firing-rate-model">Building a firing rate model&lt;/a>
&lt;ol>
&lt;li>&lt;a href="#the-firing-rate">The firing rate&lt;/a>&lt;/li>
&lt;li>&lt;a href="#the-total-synaptic-current">The total synaptic current&lt;/a>&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>&lt;a href="#networks-with-low-rank-connectivity-matrices">Networks with low-rank connectivity matrices&lt;/a>
&lt;ol>
&lt;li>&lt;a href="#networks-with-unit-rank-structure">Networks with unit-rank structure&lt;/a>&lt;/li>
&lt;li>&lt;a href="#dynamical-mean-field-theory">Dynamical Mean-Field Theory&lt;/a>&lt;/li>
&lt;li>&lt;a href="#dynamical-mean-field-theory-extension-to-the-%5c%28%5ctau_r-%5cgg-%5ctau_s%5c%29-case">Dynamical Mean-Field Theory extension to the
$\tau_r \gg \tau_s$ case&lt;/a>&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>&lt;a href="#spontaneous-activity">Spontaneous Activity&lt;/a>
&lt;ol>
&lt;li>&lt;a href="#reproduction-of-the-papers-phase-diagram">Reproduction of the paper&amp;rsquo;s phase diagram&lt;/a>&lt;/li>
&lt;li>&lt;a href="#comparison-with-the-phase-diagram-in-the-%5c%28%5ctau_r-%5cgg-%5ctau_s%5c%29-case">Comparison with the phase diagram in the
$\tau_r \gg \tau_s$ case&lt;/a>&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>&lt;a href="#response-to-an-external-input">Response to an external input&lt;/a>
&lt;ol>
&lt;li>&lt;a href="#reproduction-of-figure-2D-of-the-article">Reproduction of figure 2.D of the article&lt;/a>&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>&lt;a href="#conclusion">Conclusion&lt;/a>&lt;/li>
&lt;/ol>
&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Cortical networks, which consist of highly interconnected neurons with recurrent synapses, are believed to form the fundamental processing units of mammalian brains. Observations show that cortical connectivity lies somewhere between fully structured and fully random. Several functional approaches to the connectivity design of cortical networks have been proposed since the 1980s, but they lack a unifying conceptual picture. To address this, the authors point out that all these approaches have one thing in common: the resulting connectivity matrices are low-rank. This article [1] aims at linking the dynamics of recurrent neural networks to their connectivity matrix, and at showing how the low-rank connectivity structure of such networks can be designed to implement specific computations. The latter point is illustrated on four specific tasks.&lt;/p>
&lt;h2 id="theoretical-framework">Theoretical Framework&lt;/h2>
&lt;p>The model used here to describe cortical neural networks is a firing rate model. This means that each node (i.e. neuron) in the network is represented by its firing rate
$\phi(x_i)$ with
$\phi(x)=\textrm{tanh}(x)$ being the current-to-rate transfer function, and
$i\in [1\dots N]$, with
$N$ the number of neurons. The evolution of a neuron&amp;rsquo;s firing rate is governed by equation (1).&lt;/p>
$$
\dot{x_i}(t) = -x_i(t) + \sum_{j=1}^N J_{ij} \phi(x_j(t)) + I_i \quad (1)
$$
&lt;p>where
$J_{ij}$ is the connectivity matrix representing the synaptic connections of the network and
$I_i$ is the external current input to neuron
$i$. Note here that
$\phi(x_i)$, which represents the firing rate, can take negative values. This can be dealt with by replacing the
$\textrm{tanh}$ function, which makes the calculations easier, with a sigmoid, without causing major changes to the theoretical results.&lt;/p>
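&lt;p>To make equation (1) concrete, here is a minimal Euler-integration sketch (our own illustrative code, not the authors&amp;rsquo;; the function name and parameter values are arbitrary):&lt;/p>

```python
import numpy as np

def simulate_eq1(J, I, x0, dt=0.1, T=50.0, phi=np.tanh):
    """Euler integration of eq. (1): dx_i/dt = -x_i + sum_j J_ij phi(x_j) + I_i."""
    x = x0.copy()
    for _ in range(int(T / dt)):
        x = x + dt * (-x + J @ phi(x) + I)
    return x

# Sanity check: with J = 0 the units decouple and each x_i relaxes to I_i.
rng = np.random.default_rng(0)
N = 100
I = rng.normal(size=N)
x_final = simulate_eq1(np.zeros((N, N)), I, x0=np.zeros(N))
```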
&lt;p>As all the results of this article come from analysis and simulations of this model, it is crucial to understand its limitations as a description of cortical neural networks. With this goal in mind, we review some developments in order to understand the underlying hypotheses and limitations of this model by rebuilding it from scratch [2].&lt;/p>
&lt;h3 id="building-a-firing-rate-model">Building a firing rate model&lt;/h3>
&lt;p>The behaviour of a neuron can be described by the neuronal response function
$y(t)$, which encodes the exact times at which it fires spikes. A model involving
$y(t)$ is called a spiking model.
$$
y(t) = \sum_{i=1}^n \delta(t-t_i), \quad r(t) = \frac{1}{\Delta t}\int_t^{t+\Delta t} \underbrace{\langle y(\tau) \rangle}_{\textrm{average over the trials}} d\tau \quad (2)
$$ &lt;/p>
&lt;p>Firing rate models focus on the quantity
$r(t)$ in eq. (2), which is an approximation of the exact spike sequence
$y(t)$. They have the advantage of being easier to simulate on computers, as they do not take into account the short-time-scale dynamics of the spikes. Since we want to model the total input to the neurons, we can look at
$r(t)$ instead of
$y(t)$ provided there is not too much variability between two trials. Indeed, upon summing over different synapses, the variability is low (Central Limit Theorem) if the inputs are numerous and uncorrelated.&lt;/p>
&lt;p>Firing rate models are relevant when&lt;/p>
&lt;ol>
&lt;li>The firing of neurons in a network is uncorrelated (there is little synchronous firing)&lt;/li>
&lt;li>The precise patterns of spike timing are unimportant. Indeed, the information regarding those precise patterns is lost when averaging over the trials.&lt;/li>
&lt;/ol>
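&lt;p>The passage from $y(t)$ to $r(t)$ in equation (2) can be sketched as binning spike times and averaging the counts over trials (an illustrative implementation with synthetic Poisson spikes; time is in seconds):&lt;/p>

```python
import numpy as np

def trial_averaged_rate(spike_trains, t_max, dt):
    """Estimate r(t) as in eq. (2): bin each trial's spike times,
    average the counts over trials, and divide by the bin width."""
    edges = np.arange(0.0, t_max + dt, dt)
    counts = np.stack([np.histogram(trial, bins=edges)[0] for trial in spike_trains])
    return counts.mean(axis=0) / dt

# Homogeneous Poisson spiking at 20 spikes/s over 50 trials:
# the estimated rate should fluctuate around 20.
rng = np.random.default_rng(1)
true_rate, t_max = 20.0, 100.0
trials = [np.sort(rng.uniform(0.0, t_max, rng.poisson(true_rate * t_max)))
          for _ in range(50)]
r_est = trial_averaged_rate(trials, t_max, dt=0.5)
```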
&lt;h4 id="figure-1-sketch-of-a-modeled-neuron-with-presynaptic-inputs-hahahugoshortcodes19hbhb-and-postsynaptic-output-hahahugoshortcodes20hbhb-hahahugoshortcodes21hbhb-is-the-total-synaptic-current-or-input">Figure 1: Sketch of a modeled neuron with presynaptic inputs
$r_i$ and postsynaptic output
$r(t)$.
$x(t)$ is the total synaptic current or input.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Neuron drawing" srcset="
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/Neuron_drawing_hu848aa20c091905cdfe472b7f9137f9f6_140548_2569209909915e28d228f5f2e1212662.webp 400w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/Neuron_drawing_hu848aa20c091905cdfe472b7f9137f9f6_140548_c20d0f9442433fa705a61fbe3a92ffe1.webp 760w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/Neuron_drawing_hu848aa20c091905cdfe472b7f9137f9f6_140548_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/Neuron_drawing_hu848aa20c091905cdfe472b7f9137f9f6_140548_2569209909915e28d228f5f2e1212662.webp"
width="760"
height="296"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>In order to fully describe a firing rate model, we have to specify the dependence of the postsynaptic firing rate on the total synaptic input
$r(x)$ and the dependence of the total synaptic input on the presynaptic inputs
$x(r_1,r_2,r_3,r_4)$.&lt;/p>
&lt;h4 id="the-firing-rate">The firing rate&lt;/h4>
&lt;p>Let&amp;rsquo;s first describe the firing rate
$r(t)$ as a function of the total synaptic current
$x(t)$. We could simply write
$r(t) = \phi(x(t))$,
$\phi$ being the current-to-rate function, but due to the membrane capacitance and resistance, we should rather express the firing rate
$r(t)$ as a low-pass filtered version of its steady state with characteristic time
$\tau_r$, usually of the order of 20 ms.&lt;/p>
$$
\tau_r \frac{dr}{dt} = -r + \phi(x(t))
$$
&lt;p>Note that in reality, it is the membrane potential, not the firing rate, that is a low-pass of the input current, and the dynamics of the two are not the same.&lt;/p>
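&lt;p>A quick numerical illustration of this low-pass behaviour: for a step in the input current, $r(t)$ relaxes exponentially towards $\phi(x)$ with time constant $\tau_r$ (the values below are illustrative):&lt;/p>

```python
import numpy as np

tau_r, dt = 20.0, 0.1              # ms; tau_r of the order of 20 ms, as in the text
t = np.arange(0.0, 200.0, dt)
x = np.ones_like(t)                # step in the total synaptic current at t = 0
r = np.zeros_like(t)
for k in range(1, len(t)):         # Euler step of tau_r dr/dt = -r + phi(x)
    r[k] = r[k - 1] + dt / tau_r * (-r[k - 1] + np.tanh(x[k - 1]))
# r(t) relaxes towards tanh(1); after one time constant it has covered 1 - 1/e of the gap
```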
&lt;h4 id="the-total-synaptic-current">The total synaptic current&lt;/h4>
&lt;p>We now want to write the total synaptic current of neuron
$i$,
$x_i(t)$ as a function of the
$N$ presynaptic firing rates
$r_j(t)$ for
$j \in [1 \dots N]$ and their associated weights
$J_{ij}$. Note that
$J_{ij}>0$ corresponds to an excitatory synapse while
$J_{ij}&lt;0$ corresponds to an inhibitory one. We introduce the synaptic response kernel
$K(t)$, which is simply the response current induced at time
$t$ by a spike at time
$t=0$. Assuming that the effects of spikes sum linearly, we can then write the total synaptic current as&lt;/p>
$$
x_i(t) = \sum_{j=1}^{N} J_{ij} \int_{-\infty}^{t} d\tau\, K(t-\tau) \underbrace{y_j(\tau)}_{\sum_{k} \delta(\tau-t_k^{(j)})}
$$
&lt;p>which in the firing rate model approximation writes&lt;/p>
$$
x_i(t) = \sum_{j=1}^{N} J_{ij} \int_{-\infty}^{t} d\tau K(t-\tau) r_j(\tau) \quad (3)
$$
&lt;p>By taking
$K(t) = \exp(-t/\tau_s)/\tau_s$ (with
$\tau_s$ the time constant that describes the decay of the synaptic conductance, usually of the order of a few milliseconds), then (3) can be written as a differential equation&lt;/p>
$$
\tau_s \frac{dx_i(t)}{dt} = -x_i(t) + \Big( \sum_{j=1}^{N} J_{ij} r_j(t) + I_i\Big) \quad (4)
$$
&lt;p>This is actually equation (1) of the paper, in which
$\tau_s$ was set equal to 1 for simplicity (the external input $I_i$ has been added to the recurrent input). Equation (4), together with the low-pass equation for
$r(t)$ above, gives the two parts needed to describe a firing rate model, which can be simplified in two extreme cases:&lt;/p>
&lt;ul>
&lt;li>
$\tau_r \gg \tau_s$: Then
$x_i(t) = \sum_{j=1}^{N} J_{ij} r_j(t) + I_i$ and
$r$ is a low-pass of
$x(t)$.&lt;/li>
&lt;li>
$\tau_r \ll \tau_s$: Then
$r(t) = \phi(x(t))$,
$r(t)$ follows
$x(t)$ instantaneously.&lt;/li>
&lt;/ul>
&lt;p>The authors implicitly assume that we are in the situation where
$\tau_r \ll \tau_s$, which is not obvious at all, as both characteristic times seem to be of the same order of magnitude. This observation led me to explore what the paper&amp;rsquo;s results would have been had the authors instead considered the situation where
$\tau_r \gg \tau_s$. This latter case yields the following equation for the system, to be compared with equation (1):&lt;/p>
$$
\dot{r_i} = -r_i + \phi(\sum_{j=1}^N J_{ij}r_j+I_i) \quad (5)
$$
&lt;p>In the following, we reproduce some of the paper&amp;rsquo;s theoretical and simulation results and try to extend them to a system governed by equation (5).&lt;/p>
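&lt;p>The claim that equations (1) and (5) share their stationary states can be checked numerically: at a fixed point $r = \phi(x)$, so the two descriptions coincide. A sketch with illustrative parameters (weak coupling, so that the fixed point is unique):&lt;/p>

```python
import numpy as np

def step_eq1(x, J, I, dt):
    # eq. (1): dx/dt = -x + J phi(x) + I  (tau_s set to 1)
    return x + dt * (-x + J @ np.tanh(x) + I)

def step_eq5(r, J, I, dt):
    # eq. (5): dr/dt = -r + phi(J r + I)  (tau_r set to 1)
    return r + dt * (-r + np.tanh(J @ r + I))

rng = np.random.default_rng(2)
N = 200
J = 0.5 * rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))  # weak coupling
I = 0.1 * rng.normal(size=N)
x, r = np.zeros(N), np.zeros(N)
for _ in range(2000):
    x = step_eq1(x, J, I, 0.05)
    r = step_eq5(r, J, I, 0.05)
# After convergence, r equals phi(x): the stationary states coincide.
```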
&lt;h3 id="networks-with-low-rank-connectivity-matrices">Networks with low-rank connectivity matrices&lt;/h3>
&lt;p>We start by placing ourselves in the same setting as the article, i.e. with a negligible membrane relaxation time constant.&lt;/p>
&lt;p>The connectivity matrix
$J_{ij}$ is the sum of an uncontrolled random matrix
$\chi$ and of a known, structured, low-rank matrix
$P$.
$J_{ij}$ is thus defined by&lt;/p>
$$
J_{ij} = \underbrace{g \chi_{ij}}_{\textrm{mean $0$, variance $g^2/N$}} + \underbrace{P_{ij}}_{\textrm{of order $1/N$}}
$$
&lt;p>Note that there is no biological reason why
$\chi_{ij}$ should have a variance scaling as
$1/N$. This constraint makes it possible to compare networks of different sizes from a theoretical perspective (especially in the limit
$N \to \infty$) and can be compensated for by adjusting the random strength
$g$ at will.&lt;/p>
&lt;h4 id="networks-with-unit-rank-structure">Networks with unit-rank structure&lt;/h4>
&lt;p>We start with
$P_{ij} = \frac{m_i n_j}{N}$ with
$m=\{m_i\}$ and
$n=\{n_j\}$ two N-dimensional vectors. The authors define two important parameters that govern the type of dynamics of the system:
$g$, the random strength, and
$m^Tn/N$, the structure strength. The networks studied here are related to the Hopfield networks studied in class; however, the unit-rank terms here are not required to be symmetric and can be correlated with each other. Regarding the biological plausibility of this proposed connectivity, note that Dale&amp;rsquo;s law is not imposed here.&lt;/p>
&lt;h4 id="dynamical-mean-field-theory">Dynamical Mean-Field Theory&lt;/h4>
&lt;p>Under the assumption of a large network with a weak low-dimensional connectivity matrix (scaling as
$1/N$), one can derive the activity of each neuron with dynamical mean-field theory by considering the mean and variance of the input it receives. The authors find that the average equilibrium input to unit
$i$ is given by
$\mu_i = \kappa m_i$ with
$\kappa = \langle n_i[\phi_i] \rangle_i$; that is, the activity of the network is one-dimensional, along the vector
$m$, as long as
$\kappa \neq 0$. As
$\kappa$ represents the activity projected on vector
$n$, non-vanishing values of
$\kappa$ require a non-vanishing overlap between
$m$ and
$n$.&lt;/p>
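&lt;p>This one-dimensionality can be verified directly in simulation. In a pure unit-rank network ($g=0$, an illustrative simplification), the recurrent input $J\phi(x)$ is exactly $\kappa\, m$, so the activity stays proportional to $m$ and the proportionality constant converges to $\kappa$:&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(4)
N = 1000
m = rng.normal(size=N)
n = 2.5 * m + 0.5 * rng.normal(size=N)   # strong m-n overlap, so kappa is nonzero
x = 0.1 * m.copy()                       # small initial condition along m
for _ in range(4000):                    # Euler integration of eq. (1), g = 0, I = 0
    kappa = n @ np.tanh(x) / N           # activity projected on n
    x = x + 0.05 * (-x + kappa * m)      # rank-one recurrent input: J phi(x) = kappa m
# At equilibrium the input is one-dimensional: x = kappa * m.
```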
&lt;h4 id="dynamical-mean-field-theory-extension-to-the-hahahugoshortcodes84hbhb-case">Dynamical Mean-Field Theory extension to the
$\tau_r \gg \tau_s$ case&lt;/h4>
&lt;p>Starting from equation (5), we derive a Dynamical Mean-Field approach in order to express both
$\mu_i \equiv [x_i]$ and
$\Delta_0^I \equiv [x_i^2] - [x_i]^2$. Similarly to the derivation proposed in the supplementary information of the paper, one can consider the case where
$I_i=0 \;\, \forall i$. Denoting&lt;/p>
$$
\eta_i(t) = \sum_{j=1}^N J_{ij} r_j = g \sum_{j=1}^N \chi_{ij} r_j + \frac{m_i}{N} \sum_{j=1}^N n_j r_j
$$
&lt;p>equation (5) can be rewritten as&lt;/p>
$$
\dot{r_i} = -r_i + \phi(\eta_i)
$$
&lt;p>In the stationary scenario, we would then have&lt;/p>
$$
r_i = \phi(\eta_i)
$$
&lt;p>By applying
$\phi^{-1}$ to both sides of this equation, we recover equation (28) of the paper, which gives us expressions for
$\mu_i$ and
$\Delta_0^I$.&lt;/p>
$$
\mu_i = [x_i] = m_i \kappa
$$
$$
\Delta^I_0 = [x_i^2] - [x_i]^2 = g^2 \langle [\phi_i^2] \rangle
$$
&lt;p>Conducting a DMF analysis in the chaotic scenario is however trickier, and we do not develop it in this project although it is an interesting lead.&lt;/p>
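&lt;p>As a sanity check of the stationary self-consistency, the variance equation above can be solved numerically in the homogeneous case $\mu_i = 0$ (i.e. $\kappa = 0$, an illustrative simplification), where it reduces to $\Delta_0 = g^2 \langle \phi^2(\sqrt{\Delta_0}\, z) \rangle_{z \sim \mathcal{N}(0,1)}$. A sketch using Gauss-Hermite quadrature for the Gaussian average:&lt;/p>

```python
import numpy as np

def delta0_fixed_point(g, iters=200):
    """Iterate Delta0 -> g^2 <phi^2(sqrt(Delta0) z)>, z ~ N(0,1),
    with phi = tanh, using Gauss-Hermite quadrature for the average."""
    z, w = np.polynomial.hermite_e.hermegauss(101)  # probabilists' Hermite nodes/weights
    w = w / w.sum()                                 # normalize into a N(0,1) expectation
    d = 1.0
    for _ in range(iters):
        d = g ** 2 * np.sum(w * np.tanh(np.sqrt(d) * z) ** 2)
    return d
```

Below $g=1$ the iteration collapses onto the trivial solution $\Delta_0 = 0$; above $g=1$ a non-zero variance appears, consistent with the transition at $g=1$ in the phase diagrams.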
&lt;h2 id="spontaneous-activity">Spontaneous Activity&lt;/h2>
&lt;h3 id="reproduction-of-the-papers-phase-diagram">Reproduction of the paper&amp;rsquo;s phase diagram&lt;/h3>
&lt;h4 id="figure-2-phase-diagram-obtained-from-personal-simulations-5050-resolution-left-measures-the-chaotic-nature-of-the-system-right-measures-the-structured-nature-of-the-system">Figure 2: Phase diagram obtained from personal simulations (50×50 resolution). Left: measures the chaotic nature of the system. Right: measures the structured nature of the system.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Phase diagram obtained from personal simulations" srcset="
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_stat_chao_randomness_fixed_hu9affe9aeb74049e22a5a3fafe59ef926_161798_72acb47816139c21a937fe3a86ccdb6e.webp 400w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_stat_chao_randomness_fixed_hu9affe9aeb74049e22a5a3fafe59ef926_161798_0c0d61a197a880f68f2a3e5986eb882e.webp 760w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_stat_chao_randomness_fixed_hu9affe9aeb74049e22a5a3fafe59ef926_161798_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_stat_chao_randomness_fixed_hu9affe9aeb74049e22a5a3fafe59ef926_161798_72acb47816139c21a937fe3a86ccdb6e.webp"
width="760"
height="245"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h4 id="figure-3-left-theoretical-phase-diagram-from-the-article-right-phase-diagram-obtained-from-personal-simulations-5050-resolution">Figure 3: Left: Theoretical phase diagram from the article. Right: Phase diagram obtained from personal simulations (50×50 resolution).&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Phase diagram" srcset="
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_art_hu5f5ca8b9ee743a0affe16b0442c9bb39_302832_2eb9771acb9984ae9f98b85077819f45.webp 400w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_art_hu5f5ca8b9ee743a0affe16b0442c9bb39_302832_4ec7d2540eb4fb13d1fedc7dd7073c02.webp 760w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_art_hu5f5ca8b9ee743a0affe16b0442c9bb39_302832_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_art_hu5f5ca8b9ee743a0affe16b0442c9bb39_302832_2eb9771acb9984ae9f98b85077819f45.webp"
width="760"
height="272"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>In figure 1 of the article, a phase diagram is proposed, obtained from theoretical results. We chose to reproduce this phase diagram through simulations (fig. 3). To do so, finding good statistics to characterize the chaoticity and structure of the system is crucial. The authors suggest using the temporal standard deviation of
$x_i$, averaged over neurons,
$\langle \textrm{std}_t(x_i) \rangle_i$, to characterize chaoticity. We use the activity along
$m$, given by
$\kappa = \langle n_i[\phi_i] \rangle_i$, to characterize structure. We superimposed the phase diagrams for chaoticity and structure shown in figure 2 to obtain the one shown in the right panel of figure 3.&lt;/p>
&lt;p>Each of the 2500 pixels of the simulated phase diagram represents one simulation. The statistics were measured over 100 seconds, after the transient phase was over, with time step
$\Delta t = 0.5 \textrm{s}$. Invoking ergodicity, this allows us to avoid running several simulations for each pair of parameters, which would have multiplied the time spent simulating the transient phase by the same amount.&lt;/p>
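&lt;p>As an illustration, both statistics can be computed from a single long trajectory after discarding the transient (a sketch with illustrative parameters; in the stationary regime the temporal fluctuations die out, so both measures vanish):&lt;/p>

```python
import numpy as np

def order_parameters(J, n, x0, dt=0.1, T_transient=100.0, T_measure=100.0):
    """Chaoticity <std_t(x_i)>_i and structure kappa = <n_i [phi_i]>_i,
    measured on one long trajectory after discarding the transient."""
    x = x0.copy()
    for _ in range(int(T_transient / dt)):      # discard the transient phase
        x = x + dt * (-x + J @ np.tanh(x))
    xs = np.empty((int(T_measure / dt), len(x)))
    for k in range(xs.shape[0]):                # record the trajectory
        x = x + dt * (-x + J @ np.tanh(x))
        xs[k] = x
    chaoticity = xs.std(axis=0).mean()          # <std_t(x_i)>_i
    kappa = (np.tanh(xs) @ n).mean() / len(x)   # time-averaged activity along n
    return chaoticity, kappa

rng = np.random.default_rng(5)
N = 300
n = rng.normal(size=N)
J_weak = 0.5 * rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))  # g = 0.5: stationary
chaos, kappa = order_parameters(J_weak, n, x0=rng.normal(size=N))
```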
&lt;p>The biologically plausible phase is the one where there is both structured and chaotic activity.&lt;/p>
&lt;h3 id="comparison-with-the-phase-diagram-in-the-hahahugoshortcodes102hbhb-case">Comparison with the phase diagram in the
$\tau_r \gg \tau_s$ case&lt;/h3>
&lt;p>We investigate the changes that using equation (5) instead of (1) brings to the phase diagram describing the system&amp;rsquo;s behavior. Interestingly, the phase diagram obtained for equation (5) is very similar to that of (1) (see figure 4). This was expected for the stationary part of the diagram, but not necessarily for the chaotic part. This result suggests that the DMF theoretical results one would derive for our alternative model might be the same as those found in the paper.&lt;/p>
&lt;h4 id="figure-4-left-phase-diagram-for-hahahugoshortcodes103hbhb-right-phase-diagram-for-hahahugoshortcodes104hbhb-resolution-50--50">Figure 4: Left: Phase diagram for
$\tau_r \ll \tau_s$. Right: Phase diagram for
$\tau_r \gg \tau_s$ (resolution 50 × 50).&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Phase diagram" srcset="
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_compare_hu5f5ca8b9ee743a0affe16b0442c9bb39_190585_48959268255e8193b25fa6dd4a936d13.webp 400w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_compare_hu5f5ca8b9ee743a0affe16b0442c9bb39_190585_7f49cc42d668199f390cd538dd24e0e9.webp 760w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_compare_hu5f5ca8b9ee743a0affe16b0442c9bb39_190585_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/phase_diag_compare_hu5f5ca8b9ee743a0affe16b0442c9bb39_190585_48959268255e8193b25fa6dd4a936d13.webp"
width="760"
height="272"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="response-to-an-external-input">Response to an external input&lt;/h2>
&lt;p>For further analysis, we aim at comparing the behavior of our alternative model to that of the paper in response to an external stimulus
$I$. As in the article, we look at the response of the system along the
$m$ vector. Figure 5 shows the transient dynamics of both models when the system is subject to an external input, with the same connectivity matrix and the same initial conditions. One can see that, as predicted by the theory, the stationary states are the same, while the transient phases slightly differ. Indeed, the system evolves more slowly for the alternative model (figure 5, right).&lt;/p>
&lt;h4 id="figure-5-transient-dynamics-in-the-hahahugoshortcodes107hbhb-left-and-hahahugoshortcodes108hbhb-right-scenarios-in-response-to-a-step-input">Figure 5: Transient dynamics in the
$\tau_r \ll \tau_s$ (Left) and
$\tau_r \gg \tau_s$ (Right) scenarios in response to a step input.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Transient dynamics" srcset="
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/transient_dyn_hu5f5ca8b9ee743a0affe16b0442c9bb39_280227_c4feb0213e66a0af71cb9e5fa6f023bf.webp 400w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/transient_dyn_hu5f5ca8b9ee743a0affe16b0442c9bb39_280227_953bc30688aa9a0c07a9aea118da1577.webp 760w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/transient_dyn_hu5f5ca8b9ee743a0affe16b0442c9bb39_280227_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/transient_dyn_hu5f5ca8b9ee743a0affe16b0442c9bb39_280227_c4feb0213e66a0af71cb9e5fa6f023bf.webp"
width="760"
height="272"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h3 id="reproduction-of-figure-2d-of-the-article">Reproduction of figure 2.D of the article&lt;/h3>
&lt;p>Figure 6, corresponding to figure 2.D of the article, plays a major role in the four task implementations designed in the remainder of the paper. Reproducing this result for our alternative model essentially ensures that it will be able to perform the proposed tasks as well. The reproduction of this result is presented in figure 7. The activity along vector
$m$ is recorded for 15 different input intensities in three scenarios. First, the case where
$m$,
$n$, and
$I$ are all orthogonal with respect to each other; the system then shows no response along
$m$, as
$\kappa = \langle n_i[\phi_i] \rangle_i = 0$ (fig. 7, left). Second, the case where
$m$ and
$n$ are orthogonal and
$I$ has a component along
$n$; the network then shows some activity along
$m$ (fig. 7, center). Finally, the case where
$m$ and
$n$ have an overlap and
$I$ is collinear with the part of
$n$ orthogonal to
$m$; the system then shows bistable activity along
$m$ (fig. 7, right). This bistable activity vanishes when the input becomes too strong.&lt;/p>
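&lt;p>The three scenarios only differ in how the vectors $m$, $n$, and $I$ are generated; the required orthogonality relations can be enforced with Gram-Schmidt projections (an illustrative construction, not the authors&amp;rsquo; code):&lt;/p>

```python
import numpy as np

def remove_components(v, *others):
    """Return v minus its projections on each vector in `others` (Gram-Schmidt)."""
    v = v.copy()
    for u in others:
        v = v - (v @ u) / (u @ u) * u
    return v

rng = np.random.default_rng(6)
N = 500
m = rng.normal(size=N)
n_orth = remove_components(rng.normal(size=N), m)          # scenarios 1-2: n orthogonal to m
I_orth = remove_components(rng.normal(size=N), m, n_orth)  # scenario 1: I orthogonal to m and n
I_along_n = I_orth + 2.0 * n_orth                          # scenario 2: I has a component on n
```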
&lt;p>The same results were observed for our alternative model, and are presented in figure 8.&lt;/p>
&lt;h4 id="figure-6">Figure 6&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Figure 2D of the article" srcset="
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2D_art_hu5f5ca8b9ee743a0affe16b0442c9bb39_268290_94fb132348cf83d2cf4fe7ff5128a064.webp 400w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2D_art_hu5f5ca8b9ee743a0affe16b0442c9bb39_268290_172299482f68a7f76c29c251766fb6d9.webp 760w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2D_art_hu5f5ca8b9ee743a0affe16b0442c9bb39_268290_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2D_art_hu5f5ca8b9ee743a0affe16b0442c9bb39_268290_94fb132348cf83d2cf4fe7ff5128a064.webp"
width="760"
height="272"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h4 id="figure-7-activity-along-hahahugoshortcodes126hbhb-as-a-function-of-input-strength-in-three-different-scenarios-in-the-hahahugoshortcodes127hbhb-scenario-left-hahahugoshortcodes128hbhb-hahahugoshortcodes129hbhb-and-hahahugoshortcodes130hbhb-are-orthogonal-center-hahahugoshortcodes131hbhb-and-hahahugoshortcodes132hbhb-are-orthogonal-and-hahahugoshortcodes133hbhb-has-a-component-along-hahahugoshortcodes134hbhb-right-hahahugoshortcodes135hbhb-and-hahahugoshortcodes136hbhb-have-an-overlap">Figure 7: Activity along
$m$ as a function of input strength in three different scenarios in the
$\tau_r \ll \tau_s$ scenario. Left:
$m$,
$n$, and
$I$ are orthogonal. Center:
$m$ and
$n$ are orthogonal, and
$I$ has a component along
$n$. Right:
$m$ and
$n$ have an overlap.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Activity along m" srcset="
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2Dorth_hu7de157e7b5dd6f9811db308f54d84d48_137833_d968ce58b41ed18d6a4dd971e087738a.webp 400w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2Dorth_hu7de157e7b5dd6f9811db308f54d84d48_137833_27b778c133d4f63197f9cdbf5f24eadf.webp 760w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2Dorth_hu7de157e7b5dd6f9811db308f54d84d48_137833_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2Dorth_hu7de157e7b5dd6f9811db308f54d84d48_137833_d968ce58b41ed18d6a4dd971e087738a.webp"
width="760"
height="185"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h4 id="figure-8-same-as-figure-7-but-in-the-hahahugoshortcodes137hbhb-scenario">Figure 8: Same as figure 7 but in the
$\tau_r \gg \tau_s$ scenario.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Same as figure 2D but for the alternative model" srcset="
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2Dorth_alt_hu7de157e7b5dd6f9811db308f54d84d48_140694_5329b3e44b28a2bbf1968140f576f1a0.webp 400w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2Dorth_alt_hu7de157e7b5dd6f9811db308f54d84d48_140694_4edd4419d7434bbd4d1cb70eda8d6bff.webp 760w,
/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2Dorth_alt_hu7de157e7b5dd6f9811db308f54d84d48_140694_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/paper-deep-dive-linking-connectivity-dynamics-and-computations-in-low-rank-recurrent-neural-networks-mastrogiuseppe-ostojic/2Dorth_alt_hu7de157e7b5dd6f9811db308f54d84d48_140694_5329b3e44b28a2bbf1968140f576f1a0.webp"
width="760"
height="185"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>Through a careful critical reading, we noticed that when building their firing rate model, the authors implicitly left unexplored an alternative model in which
$\tau_r$ (the membrane time constant) is not negligible with respect to
$\tau_s$ (the synaptic time constant). This alternative model is an interesting extension, as it covers a theoretical regime ignored by the authors. We showed through a short theoretical analysis that both models are equivalent in the stationary case. Finally, we showed numerically that the alternative model proposed in equation (5) behaves very similarly to the paper&amp;rsquo;s, both for spontaneous activity and in response to an external stimulus. Because this behavior analysis is the building block of the four task implementations, it is reasonable to expect that the proposed alternative model will also be able to perform these tasks. The mini-project conducted here therefore suggests that the low-rank recurrent neural networks studied in this paper are even more biologically plausible than claimed.&lt;/p>
&lt;hr>
&lt;h3 id="references">References&lt;/h3>
&lt;ul>
&lt;li>[1] Francesca Mastrogiuseppe and Srdjan Ostojic. Linking connectivity, dynamics and computations in low-rank recurrent neural networks. Neuron, 99(3):609–623.e29, August 2018. arXiv: 1711.09672.&lt;/li>
&lt;li>[2] Peter Dayan and L. F. Abbott. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. Computational Neuroscience series. MIT Press, Cambridge, Mass., 2001.&lt;/li>
&lt;/ul></description></item><item><title>Data challenge : Can you predict the tide ?</title><link>https://hugoninou.netlify.app/post/test/</link><pubDate>Mon, 01 May 2023 07:04:36 +0000</pubDate><guid>https://hugoninou.netlify.app/post/test/</guid><description>&lt;h2 id="table-of-contents">Table of Contents&lt;/h2>
&lt;ol>
&lt;li>
&lt;p>&lt;a href="#introduction">Introduction&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#data-analysis">Data Analysis&lt;/a>&lt;/p>
&lt;ol>
&lt;li>&lt;a href="#heat-maps">Heat Maps&lt;/a>&lt;/li>
&lt;li>&lt;a href="#autocorrelation-of-the-tidal-surplus-over-time">Autocorrelation of the Tidal Surplus Over Time&lt;/a>&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#development-of-a-recurrent-neural-network-rnn">Development of a Recurrent Neural Network (RNN)&lt;/a>&lt;/p>
&lt;ol>
&lt;li>&lt;a href="#the-encoder-decoder-model">The Encoder-Decoder Model&lt;/a>&lt;/li>
&lt;li>&lt;a href="#splitting-the-data-into-training-set-and-testing-set">Splitting the Data into Training Set and Testing Set&lt;/a>&lt;/li>
&lt;li>&lt;a href="#naive-implementation-without-using-pressure-fields">Naive Implementation Without Using Pressure Fields&lt;/a>&lt;/li>
&lt;li>&lt;a href="#naive-implementation-using-all-pressure-and-wind-fields">Naive Implementation Using All Pressure and Wind Fields&lt;/a>&lt;/li>
&lt;li>&lt;a href="#implementation-with-dimensionality-reduction-of-pressure-and-wind-fields">Implementation with Dimensionality Reduction of Pressure and Wind Fields&lt;/a>&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#conclusion">Conclusion&lt;/a>&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>This project is part of the course Information and Complexity and aims to address the challenge &amp;lsquo;Can you predict the tide?&amp;rsquo;. The challenge, proposed by the FLUMINANCE team at Inria, consists of predicting the tidal surplus based on past measurements and pressure fields. A direct application of this challenge is to provide more efficient responses to extreme tidal events. In this report, we detail our approach, which is carried out in two stages: we first present key elements of the data analysis, and then propose a relevant implementation of an &lt;em>Encoder-Decoder&lt;/em> model to meet this challenge.&lt;/p>
&lt;h2 id="data-analysis">Data Analysis&lt;/h2>
&lt;h3 id="heat-maps">Heat Maps&lt;/h3>
&lt;p>To choose the most relevant algorithms for the given problem, we started by studying the dataset. A first approach consists of performing a linear regression of the tidal surplus at a given time on each point of the pressure field at the same time, after z-scoring the pressure values. The correlation between the tidal surplus and the pressure field at each point yields the heatmap in Figure 1. Note that there are 5 times more pressure fields than tidal-surplus time points, so the regression pairs each surplus measurement with the pressure field closest in time. Blue areas correspond to points where a depression induces a positive tidal surplus; red areas indicate the opposite. From this simple analysis, we can already suggest that City 1 is likely located in the Southeast quadrant of the map, while City 2 is likely in the Northwest quadrant.&lt;/p>
&lt;p>We can proceed similarly using the spatial derivatives of the pressure field, which we refer to as horizontal wind (derivative along the x-axis) and vertical wind (derivative along the y-axis). The results are shown in Figures 2 and 3. Overall, the horizontal wind appears to carry more information about the tidal surplus (the heat maps for the vertical wind contain many coefficients close to zero).&lt;/p>
&lt;h4 id="figure-1-heatmap-of-the-correlation-between-the-centered-and-reduced-pressure-field-and-the-tidal-surplus-for-city-1-left-and-city-2-right">Figure 1: Heatmap of the correlation between the centered and reduced pressure field and the tidal surplus for City 1 (left) and City 2 (right).&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="heatmap1_p.png" srcset="
/post/test/heatmap1_p_hu4854368521544377154bb2cdeb8faa55_654409_573e23f910b66e8fe09bf0979681c956.webp 400w,
/post/test/heatmap1_p_hu4854368521544377154bb2cdeb8faa55_654409_e5c0029b6ac4a01315262df8632483d9.webp 760w,
/post/test/heatmap1_p_hu4854368521544377154bb2cdeb8faa55_654409_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/heatmap1_p_hu4854368521544377154bb2cdeb8faa55_654409_573e23f910b66e8fe09bf0979681c956.webp"
width="760"
height="321"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h4 id="figure-2-heatmap-of-the-correlation-between-the-centered-and-reduced-horizontal-wind-and-the-tidal-surplus-for-city-1-left-and-city-2-right">Figure 2: Heatmap of the correlation between the centered and reduced horizontal wind and the tidal surplus for City 1 (left) and City 2 (right).&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="heatmap1_wh.png" srcset="
/post/test/heatmap1_wh_hu4854368521544377154bb2cdeb8faa55_652888_0498c0a180deb32680f098d3da2abe68.webp 400w,
/post/test/heatmap1_wh_hu4854368521544377154bb2cdeb8faa55_652888_8ea6b5c357d6402b2a4229f338d860cd.webp 760w,
/post/test/heatmap1_wh_hu4854368521544377154bb2cdeb8faa55_652888_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/heatmap1_wh_hu4854368521544377154bb2cdeb8faa55_652888_0498c0a180deb32680f098d3da2abe68.webp"
width="760"
height="321"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h4 id="figure-3-heatmap-of-the-correlation-between-the-centered-and-reduced-vertical-wind-and-the-tidal-surplus-for-city-1-left-and-city-2-right">Figure 3: Heatmap of the correlation between the centered and reduced vertical wind and the tidal surplus for City 1 (left) and City 2 (right).&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="heatmap1_wv.png" srcset="
/post/test/heatmap1_wv_hu4854368521544377154bb2cdeb8faa55_638212_b7fc450afa5c3f61e97664cc2419b731.webp 400w,
/post/test/heatmap1_wv_hu4854368521544377154bb2cdeb8faa55_638212_72a54596121f22799269ed8f9315ca7a.webp 760w,
/post/test/heatmap1_wv_hu4854368521544377154bb2cdeb8faa55_638212_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/heatmap1_wv_hu4854368521544377154bb2cdeb8faa55_638212_b7fc450afa5c3f61e97664cc2419b731.webp"
width="760"
height="321"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
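&lt;p>The heatmap computation described above can be sketched as follows (a minimal illustration with synthetic data; it assumes the fields have already been matched to the nearest surplus measurement, and the array shapes are hypothetical):&lt;/p>

```python
import numpy as np

def correlation_heatmap(fields, surplus):
    """Pearson correlation between each grid point of the pressure field
    and the tidal surplus time series.

    fields:  (T, H, W) array, one field per surplus measurement
             (the field closest in time to each measurement)
    surplus: (T,) array
    Returns an (H, W) heatmap of correlation coefficients.
    """
    T = fields.shape[0]
    # z-score each grid point across time, and the surplus series
    f = (fields - fields.mean(axis=0)) / fields.std(axis=0)
    s = (surplus - surplus.mean()) / surplus.std()
    # correlation = time average of the product of z-scored variables
    return np.einsum('thw,t->hw', f, s) / T

# toy example: the surplus is driven by a single grid point
rng = np.random.default_rng(0)
fields = rng.normal(size=(500, 41, 41))
surplus = fields[:, 10, 20] + 0.1 * rng.normal(size=500)
heat = correlation_heatmap(fields, surplus)
```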
&lt;h3 id="autocorrelation-of-the-tidal-surplus-over-time">Autocorrelation of the Tidal Surplus Over Time&lt;/h3>
&lt;p>We have briefly discussed the information that the pressure fields provide regarding the tidal surplus in Cities 1 and 2. In this section, we analyze the information that the time series of the tidal surplus carries about itself. One way to do this is by looking at the autocorrelation of the tidal surplus over time for Cities 1 and 2. After reordering the tidal surplus data in chronological order, we plotted the autocorrelation curves shown in Figure 4. One limitation of this approach is that the time intervals between consecutive points are irregular. On average, the interval is 9 hours, but about two-thirds of the points are evaluated at the same time as the previous point, likely due to an expansion of the dataset. Nonetheless, it is still possible to observe that the tidal surplus is quite autocorrelated over time, with a more marked correlation for City 2 than for City 1. Therefore, it will likely be much easier to infer accurate results for City 2 than for City 1, which is indeed what we observe when separating the prediction scores for each city.&lt;/p>
&lt;h4 id="figure-4-autocorrelation-of-the-tidal-surplus-for-city-1-left-and-city-2-right">Figure 4: Autocorrelation of the tidal surplus for City 1 (left) and City 2 (right).&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="autocorr1.png" srcset="
/post/test/autocorr1_hub25891ca9f89e8b991d0b2d4ad328dbf_477894_13cb2d660e5c272fbc03e773f469a84d.webp 400w,
/post/test/autocorr1_hub25891ca9f89e8b991d0b2d4ad328dbf_477894_fb1cadca173cb695daff717ed348f858.webp 760w,
/post/test/autocorr1_hub25891ca9f89e8b991d0b2d4ad328dbf_477894_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/autocorr1_hub25891ca9f89e8b991d0b2d4ad328dbf_477894_13cb2d660e5c272fbc03e773f469a84d.webp"
width="760"
height="276"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
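&lt;p>The autocorrelation estimate can be sketched as below (a standard biased estimator applied to a toy periodic signal; the irregular sampling intervals discussed above are ignored here):&lt;/p>

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate at lags 0..max_lag (the series is
    assumed to be already reordered chronologically, as in the text)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x) / len(x)
    acf = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        acf[lag] = np.dot(x[:len(x) - lag], x[lag:]) / len(x) / var
    return acf

# toy check on a periodic signal (period of about 200 samples)
t = np.linspace(0.0, 20.0 * np.pi, 2000)
acf = autocorrelation(np.sin(t), max_lag=100)
```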
&lt;h2 id="development-of-a-recurrent-neural-network-rnn">Development of a Recurrent Neural Network (RNN)&lt;/h2>
&lt;p>The development of artificial neural networks for solving prediction problems in meteorology and climatology is relatively recent. Here, we draw inspiration from two articles that use an &lt;em>Encoder-Decoder&lt;/em> architecture [1][3], which have demonstrated state-of-the-art performance for time series prediction.&lt;/p>
&lt;h3 id="the-encoder-decoder-model">The Encoder-Decoder Model&lt;/h3>
&lt;p>The architecture of the &lt;em>Encoder-Decoder&lt;/em> model that we implemented is shown in Figure 5. It consists of a sequence of recurrent cells (GRU cells) that take a vector from the time series as input and pass along a &amp;ldquo;hidden vector&amp;rdquo; encoding the system state. Once the entire input sequence has been fed to the encoder, the final &amp;ldquo;hidden vector&amp;rdquo; is passed to a decoder made of recurrent cells, which take the tidal surplus at time (t) as input and predict the tidal surplus at time (t+1). The implementation is done in Python using the PyTorch library.&lt;/p>
&lt;p>Several choices can be made for the recurrent cells. The basic cell consists of a simple matrix multiplication followed by a non-linear activation, but this approach suffers from exploding or vanishing gradients during training [4]. LSTM cells were developed to address this; GRU cells are an alternative with fewer parameters and comparable performance.&lt;/p>
&lt;p>Gradient descent is performed with the Adam optimizer [5], combined with L2 regularization.&lt;/p>
&lt;h4 id="figure-5-architecture-of-the-implemented-encoder-decoder-model">Figure 5: Architecture of the implemented Encoder-Decoder model.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="encoder_decoder.png" srcset="
/post/test/encoder_decoder_hucdd7647fc46fa05091c1dc71d444978d_178798_a794a4751c9949856acb3840a85acf2b.webp 400w,
/post/test/encoder_decoder_hucdd7647fc46fa05091c1dc71d444978d_178798_d4cdc1b217017ed03b3ad10661291158.webp 760w,
/post/test/encoder_decoder_hucdd7647fc46fa05091c1dc71d444978d_178798_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/encoder_decoder_hucdd7647fc46fa05091c1dc71d444978d_178798_a794a4751c9949856acb3840a85acf2b.webp"
width="760"
height="263"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
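&lt;p>As a rough sketch (not our exact network; the input size, feedback loop, and loss are illustrative assumptions), the encoder-decoder of Figure 5 can be written in PyTorch as follows, with Adam and L2 regularization expressed through &lt;code>weight_decay&lt;/code>:&lt;/p>

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Sketch of the Figure 5 architecture: a GRU encoder reads the past
    sequence; its final hidden state initialises a GRU decoder that is
    fed its own prediction at each step."""

    def __init__(self, n_features=8, hidden=25, layers=2, dropout=0.2):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, layers,
                              batch_first=True, dropout=dropout)
        self.decoder = nn.GRU(2, hidden, layers,
                              batch_first=True, dropout=dropout)
        self.head = nn.Linear(hidden, 2)  # one surplus value per city

    def forward(self, past, last_surplus, horizon):
        _, h = self.encoder(past)        # h is the final "hidden vector"
        y = last_surplus.unsqueeze(1)    # (batch, 1, 2)
        preds = []
        for _ in range(horizon):         # autoregressive decoding
            out, h = self.decoder(y, h)
            y = self.head(out)
            preds.append(y)
        return torch.cat(preds, dim=1)   # (batch, horizon, 2)

model = EncoderDecoder()
# Adam with L2 regularization via weight_decay
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
past = torch.randn(4, 30, 8)             # toy batch: 30 past time steps
pred = model(past, torch.randn(4, 2), horizon=5)
opt.zero_grad()
pred.pow(2).mean().backward()            # placeholder loss for illustration
opt.step()
```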
&lt;h3 id="splitting-the-data-into-training-set-and-testing-set">Splitting the Data into Training Set and Testing Set&lt;/h3>
&lt;p>We started by dividing the dataset randomly into a training set (90%) and a testing set (10%). However, large discrepancies were observed between the score on our own testing set and the score of the challenge. This is because the dataset contains several points that are very close, or sometimes overlapping, in time, so overfitting on the training set leaked into the testing set. To remedy this, we split the dataset into blocks of consecutive time points, which are correlated over time (refer to Figure 4).&lt;/p>
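&lt;p>This block-wise split can be sketched as follows (the block length of 50 is an illustrative choice, not the value we used):&lt;/p>

```python
import numpy as np

def block_split(n_samples, test_frac=0.1, block_len=50, seed=0):
    """Split indices 0..n_samples-1 into train/test by whole blocks of
    consecutive time points, so temporally correlated neighbours never
    end up on both sides of the split."""
    rng = np.random.default_rng(seed)
    blocks = [np.arange(i, min(i + block_len, n_samples))
              for i in range(0, n_samples, block_len)]
    order = rng.permutation(len(blocks))
    n_test = max(1, int(round(test_frac * len(blocks))))
    test_idx = np.concatenate([blocks[i] for i in order[:n_test]])
    train_idx = np.concatenate([blocks[i] for i in order[n_test:]])
    return np.sort(train_idx), np.sort(test_idx)

train, test = block_split(1000)
```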
&lt;h3 id="naive-implementation-without-using-pressure-fields">Naive Implementation Without Using Pressure Fields&lt;/h3>
&lt;p>A first attempt used the neural network presented in Figure 5, taking only the time series of tidal surplus as input and ignoring the pressure and wind fields. To minimize the score, we fine-tuned the network&amp;rsquo;s hyperparameters, such as the number of layers within each cell, the size of the hidden vector, and the dropout rate (the fraction of units randomly zeroed during training). With two layers, a hidden vector of size 25, and a dropout rate of 20%, we achieved a minimum score of 0.63. The evolution of the test score during training is shown in Figure 6.&lt;/p>
&lt;h4 id="figure-6-scores-for-the-training-set-and-test-set-over-the-epochs-of-learning-in-the-naive-implementation-without-using-pressure-fields">Figure 6: Scores for the training set and test set over the epochs of learning in the naive implementation without using pressure fields.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="losses_no_slp.png" srcset="
/post/test/losses_no_slp_hu848aa20c091905cdfe472b7f9137f9f6_342438_5a22cefc37c4ac439a6b99cc1711557e.webp 400w,
/post/test/losses_no_slp_hu848aa20c091905cdfe472b7f9137f9f6_342438_aa894d05c5056d0ba6bc22826b071879.webp 760w,
/post/test/losses_no_slp_hu848aa20c091905cdfe472b7f9137f9f6_342438_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/losses_no_slp_hu848aa20c091905cdfe472b7f9137f9f6_342438_5a22cefc37c4ac439a6b99cc1711557e.webp"
width="760"
height="296"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h3 id="naive-implementation-using-all-pressure-and-wind-fields">Naive Implementation Using All Pressure and Wind Fields&lt;/h3>
&lt;p>A second approach involved constructing a large vector comprising the tidal surplus in each city, along with the pressure and wind fields at the closest time (see Figure 7). The results obtained in this case are shown in Figure 8, using four layers, a hidden vector of size 200, and a dropout rate of 50%. This network achieved a minimum score of 0.57, which is a 5% improvement over the previous method but at the cost of a 25,000-fold increase in the number of parameters. A significant limitation of this implementation is the tendency for the model to overfit, despite L2 regularization, which forced us to use a very high dropout rate.&lt;/p>
&lt;h4 id="figure-7-architecture-with-expanded-input-vectors-including-pressure-and-wind-fields">Figure 7: Architecture with expanded input vectors including pressure and wind fields.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="encoder_decoder_plus.png" srcset="
/post/test/encoder_decoder_plus_hu2a5c75e5f7e652be38ee83f66f3cca62_192050_a050b0cc6c3863fac23ccaa625c14095.webp 400w,
/post/test/encoder_decoder_plus_hu2a5c75e5f7e652be38ee83f66f3cca62_192050_fcd8b85aed6634f0066c851785f9a805.webp 760w,
/post/test/encoder_decoder_plus_hu2a5c75e5f7e652be38ee83f66f3cca62_192050_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/encoder_decoder_plus_hu2a5c75e5f7e652be38ee83f66f3cca62_192050_a050b0cc6c3863fac23ccaa625c14095.webp"
width="760"
height="265"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h4 id="figure-8-scores-for-the-training-set-and-test-set-over-the-epochs-of-learning-when-using-all-pressure-and-wind-fields">Figure 8: Scores for the training set and test set over the epochs of learning when using all pressure and wind fields.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="losses_whole.png" srcset="
/post/test/losses_whole_hu848aa20c091905cdfe472b7f9137f9f6_284236_849bdafb5f1fefb5188cc95df97f604e.webp 400w,
/post/test/losses_whole_hu848aa20c091905cdfe472b7f9137f9f6_284236_90b43e92cbd1b988502ba3c12d954fa2.webp 760w,
/post/test/losses_whole_hu848aa20c091905cdfe472b7f9137f9f6_284236_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/losses_whole_hu848aa20c091905cdfe472b7f9137f9f6_284236_849bdafb5f1fefb5188cc95df97f604e.webp"
width="760"
height="296"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h3 id="implementation-with-dimensionality-reduction-of-pressure-and-wind-fields">Implementation with Dimensionality Reduction of Pressure and Wind Fields&lt;/h3>
&lt;p>Finally, we propose a less naive implementation of the neural network that leverages our data analysis. Instead of feeding the entire pressure and wind fields into the model, we compute a dot product between these fields and their associated heat maps. Each field of size 41 x 41 = 1681 is thereby reduced to a vector of size 2 (one dimension per city), shrinking the input vector from 5045 to 8 components. With four layers for the recurrent cells and a hidden vector of size 25, we achieved a score of 0.5 (Figure 9, right), an 8% improvement over the naive method fed with raw pressure and wind fields. This method is also far more resource-efficient, requiring 5000 times fewer parameters.&lt;/p>
&lt;h4 id="figure-9-scores-for-the-training-set-and-test-set-over-the-epochs-of-learning-with-dimensionality-reduction-of-pressure-and-wind-fields">Figure 9: Scores for the training set and test set over the epochs of learning with dimensionality reduction of pressure and wind fields.&lt;/h4>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="losses_with_slp_and_wind.png" srcset="
/post/test/losses_with_slp_and_wind_hu9a1788f378eb2a7f9710a6491658e0ca_438929_5cead15d35edcde87eee0eb9968e2ecd.webp 400w,
/post/test/losses_with_slp_and_wind_hu9a1788f378eb2a7f9710a6491658e0ca_438929_1174512259b2c8bc812c2de50122ff23.webp 760w,
/post/test/losses_with_slp_and_wind_hu9a1788f378eb2a7f9710a6491658e0ca_438929_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://hugoninou.netlify.app/post/test/losses_with_slp_and_wind_hu9a1788f378eb2a7f9710a6491658e0ca_438929_5cead15d35edcde87eee0eb9968e2ecd.webp"
width="760"
height="251"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
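&lt;p>The dot-product reduction amounts to projecting each field onto the two heatmap templates. A minimal sketch with synthetic arrays (real inputs would be the actual fields and the correlation maps of Figures 1&amp;ndash;3):&lt;/p>

```python
import numpy as np

def project_fields(fields, heatmaps):
    """Reduce each (H, W) field to one scalar per city via a dot product
    with that city's correlation heatmap ('template').

    fields:   (T, H, W) pressure or wind fields
    heatmaps: (2, H, W) correlation maps for City 1 and City 2
    Returns (T, 2): one projected value per field and per city.
    """
    return np.einsum('thw,chw->tc', fields, heatmaps)

rng = np.random.default_rng(1)
pressure = rng.normal(size=(100, 41, 41))   # stand-in field stack
maps = rng.normal(size=(2, 41, 41))         # stand-in heatmaps
reduced = project_fields(pressure, maps)    # (100, 2)
```

Applied to the pressure, horizontal-wind, and vertical-wind fields, this yields 6 values, which together with the 2 tidal surpluses form the 8-component input vector.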
&lt;p>Note that we are only using one-fifth of the available pressure fields. An extension to consider would be to not only use the pressure field closest in time to the tidal surplus measurement but also include the 4 previous fields, which could provide additional information. In this case, it would be necessary to compute heatmaps showing the correlation between the tidal surplus at time (t) and these fields at times (t-1), (t-2), (t-3), and (t-4).&lt;/p>
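&lt;p>Such lagged heatmaps could be computed along these lines (a sketch on small synthetic grids; it assumes each field is matched to a surplus measurement and that consecutive entries are one sampling step apart):&lt;/p>

```python
import numpy as np

def lagged_heatmaps(fields, surplus, max_lag=4):
    """Correlation heatmap between the surplus at time t and the field
    at time t-lag, for lag = 0..max_lag."""
    T = fields.shape[0]
    f = (fields - fields.mean(axis=0)) / fields.std(axis=0)
    s = (surplus - surplus.mean()) / surplus.std()
    maps = []
    for lag in range(max_lag + 1):
        n = T - lag
        # pair the field at time t with the surplus at time t+lag
        maps.append(np.einsum('thw,t->hw', f[:n], s[lag:]) / n)
    return np.stack(maps)  # (max_lag+1, H, W)

# toy data: the surplus follows grid point (5, 5) with a 2-step delay
rng = np.random.default_rng(2)
fields = rng.normal(size=(300, 8, 8))
surplus = np.roll(fields[:, 5, 5], 2)
maps = lagged_heatmaps(fields, surplus)
```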
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>Based on a thorough data analysis, we proposed an &lt;em>Encoder-Decoder&lt;/em> model to tackle this challenge. In the context of the &lt;em>Encoder-Decoder&lt;/em> model, the issue of compressing the information contained in the pressure field is central. Feeding the network with raw pressure fields leads to overfitting problems, which are difficult to resolve. The method we propose, which reduces entire fields to a single scalar, may be extreme but offers good explainability and yields very good results. The optimal approach likely lies somewhere between these two extremes, avoiding translation-invariant compression methods, as the location of depressions and winds is crucial for solving the problem.&lt;/p>
&lt;hr>
&lt;h3 id="references">References&lt;/h3>
&lt;ul>
&lt;li>[1] Sebastian Scher and Gabriele Messori. Weather and climate forecasting with neural networks: using general circulation models (GCMs) with different complexity as a study ground. &lt;em>Geoscientific Model Development&lt;/em>, 12(7):2797–2809, July 2019.&lt;/li>
&lt;li>[2] Nathawut Phandoidaen and Stefan Richter. Forecasting time series with encoder-decoder neural networks. &lt;em>arXiv:2009.08848 [math, stat]&lt;/em>, September 2020. arXiv: 2009.08848.&lt;/li>
&lt;li>[3] Encoder-Decoder Model for Multistep Time Series Forecasting Using PyTorch | by Gautham Kumaran | Towards Data Science.&lt;/li>
&lt;li>[4] Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. On the difficulty of training Recurrent Neural Networks. &lt;em>arXiv:1211.5063 [cs]&lt;/em>, February 2013. arXiv: 1211.5063.&lt;/li>
&lt;li>[5] Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization. &lt;em>arXiv:1412.6980 [cs]&lt;/em>, January 2017. arXiv: 1412.6980.&lt;/li>
&lt;/ul></description></item></channel></rss>