\documentclass[fleqn,11pt,a4paper]{article}
\pagestyle{plain}
%\usepackage{amsfonts}
\usepackage{amsmath}
%\usepackage{german}
\usepackage{color}%[dvips]
\definecolor{darkgreen}{rgb}{0.1,0.5,0.1}
\definecolor{darkblue}{rgb}{0.0,0.0,0.5}
\usepackage{hyperref}
\hypersetup{colorlinks,%
citecolor=darkblue,%
filecolor=black,%
linkcolor=darkgreen,%
urlcolor=black,%
pdftex}
\usepackage[pdftex]{graphicx} %
\usepackage[round, authoryear]{natbib}
\usepackage{psfrag}
\usepackage[latin1]{inputenc}
\usepackage{verbatim}
%\bibpunct{(}{)}{,}{a}{}{;}
\newcommand{\caps}[1]{\caption{\small #1}}
%\title{Bachelor's Thesis\\ Quantitative analysis of stable representations of disparity in natural visual images}
%\title{Bachelor's Thesis\\ Can binocular complex cells be described as forming stable representations of natural stereo-images?}
%\title{Quantitative analysis of disparity tuning properties of optimally stable representations obtained with natural visual images}
%Quantitative properties of sparse binocular representations of natural visual images
\title{Quantitative comparison of stable representations of natural stereo-images and binocular complex cells}
\author{Sebastian Bitzer\\sbitzer@uos.de}
\begin{document}
\nocite{chalupa2003}
\begin{center}
\vspace*{100pt}
%\maketitle
{\LARGE Quantitative comparison of stable representations of natural stereo-images and binocular complex cells}\ \\
\vspace*{20pt}
{\LARGE Bachelor's Thesis}
\vspace*{20pt}
Sebastian Bitzer (sbitzer@uos.de)\ \\
Cognitive Science Program\ \\
University of Osnabrück\ \\
\ \\
Supervised by\\Professor Peter König\ \\
\ \\
\today
\vspace*{150pt}
\begin{figure}[htp]
\begin{center}
\includegraphics[width=150pt]{fig/uos.pdf}
\end{center}
\end{figure}
\end{center}
\thispagestyle{empty}
\newpage
\begin{abstract}
Complex cells in primary visual cortex exhibit particular spatial properties such as tuning for stimulus orientation and spatial frequency independent of the precise stimulus position in the receptive field. Recently it has been shown that these neurons share important properties of this kind with simulated cells which were adapted to exhibit optimally stable activity in response to natural visual stimuli. Consequently, it has been suggested that complex cells can be described as forming optimally stable representations of their natural input.
The simulation from this study has now been extended to the 3D domain by optimising activities of simulated cells on binocular cat-cam video sequences of natural scenes, and in the work at hand I compare properties of the resulting stable cells to those of complex cells in order to evaluate whether a conclusion similar to that of the monocular simulation can be drawn. To this end I directly analyse subunit receptive fields, examine binocular interaction profiles and investigate responses to random-dot stereograms, mainly contrasting my findings with physiological results obtained by the groups of Ralph Freeman and Andrew Parker.
Similar to the monocular study, I find that simulated and real binocular neurons have many properties in common. Stable cells again exhibit tuning for orientation and spatial frequency as well as position invariance. Furthermore, all of the simulated cells are disparity selective and their disparity tuning curves show characteristics comparable to those of striate neurons. Moreover, a predominance of phase encoding can be observed in both systems.
Nevertheless, I notice several differences between stable and complex cells. Many of them are due to limitations of the cell model employed in the simulation, which, for example, does not allow consistent inhibitory input from one eye as is found in some real neurons. One crucial discrepancy, which I cannot explain, is a greater number of stable cells preferring large phase disparities near $\pm\pi$. I propose that this may result from properties of the natural stimuli used.
In conclusion, I find striking similarities between stable and complex cells, signifying that \emph{binocular} complex cells can be described as forming optimally stable representations of natural visual input as well. However, the cell model should be adapted to incorporate recent advances in the modelling of striate neurons and further investigation is needed to explain the open discrepancies concerning phase disparity.
\end{abstract}
\thispagestyle{empty}
\newpage
\pagenumbering{arabic}
\tableofcontents
\newpage
\section{Introduction}
Cognition is one of the most amazing phenomena in biology. In the neurosciences we assume that cognitive functions such as seeing are implemented by the neural circuitry in the brain. This leads to the questions of how this circuitry is formed and how single neurons contribute to it.
Due to the limited amount of information that can be encoded in the genome of an organism, the details of neural organisation have to be learnt from the environment. Therefore, in order to understand the circuitry in the brain it is important to know the relation between neurons and their natural stimuli. Accordingly, it has been shown that certain neuronal activities can be described as forming representations of natural stimuli which are optimally sparse \citep{olshausen1996} or optimally stable \citep{koerding2004}.
Here I quantitatively analyse whether the activities of complex cells in primary visual cortex (V1) in response to binocular stimuli can be described as forming optimally stable representations of these stimuli, too. To this end, I compare properties of optimally stable cells obtained from a simulation to those of striate cells on three different levels of analysis. I begin with a direct analysis of complex cell constituents, which are then reinvestigated indirectly via response properties of the cells, and finally I examine the activities of the cells to binocular stimuli themselves.
The following two subsections introduce the topic of binocular cells in V1 and the simulation which has been used to produce optimally stable representations of natural stimuli. In the subsequent sections I present my analysis and discuss its results.
\subsection{Binocular complex cells in primary visual cortex}
The studies mentioned in the following have been done with cats or primates. I am interested in the more abstract properties of neuronal activation and assume that these are comparable across the two species at the early stage of cortical processing considered here. This goes along with my central hypothesis that neuronal activities are adapted to natural stimulus statistics, which should be roughly equal for these two species as well.
In their fundamental work, \citet{hubel1962} showed that cells in V1 can be classified into two distinct categories according to their firing properties in response to bar or grating stimuli. Whereas so-called simple cells selectively respond to bars of a certain orientation, spatial frequency, velocity and contrast polarity at a certain position in their receptive field (RF), so-called complex cells are invariant to local contrast and stimulus position while still maintaining specificity for spatial frequency, orientation and velocity.
Additionally, large subsets of both classes of cells are selective for binocular disparity, which is possible because V1 is the first stage in the visual pathway where neurons receive significant input from both the left and the right eye. Binocular disparity describes small differences between the images of stimuli on the two retinae which result from the horizontal separation of the two eyes. Projections of stimuli outside of the fixation point will not fall on corresponding points of the retinae when the stimuli lie at different depths. These deviations from corresponding points are defined as disparity. Here I follow the common convention that far stimuli have positive disparity and near stimuli have negative disparity (see figure \ref{fig:disp} for an illustration).
\begin{figure}[hp]
\begin{center}
\includegraphics[width=300pt]{fig/disparity.jpg}
\caps{Schematic depiction of the geometry of horizontal binocular disparity; Images of fixation point F fall on corresponding points of the retinae. The difference between corresponding point and actual point of a stimulus projection onto one retina is defined as disparity. Thus in the fixation point there is zero disparity, stimuli behind the fixation point (far) are defined to have positive disparity and stimuli in front of the fixation point (near) have negative disparity. Near and far stimuli can be simulated by shifting left and right stimulus presentations on a frontoparallel plane (2D) instead of shifting in depth (3D). Far stimuli are sometimes said to produce uncrossed disparities and near stimuli to produce crossed disparities, because for near stimuli these left and right shifts have to cross each other. Vieth-M\"uller circle: theoretical circle consisting of all points whose images fall on corresponding points of both retinae; figure adapted from \citet{gonzalez1998}}
\label{fig:disp}
\end{center}
\end{figure}
Binocular simple cells can often be modelled as a linear binocular filter followed by a static nonlinearity \citep{anzai1999a,chalupa2003,freeman1990}. This means that it is possible to easily predict the response of a simple cell to binocular stimuli from its responses to monocular stimuli. However, this is not true for complex cells, which exhibit substantial nonlinearities in their responses to binocular stimuli \citep{deangelis2003}. Accordingly, binocular interaction profiles, which describe the activity of cells to bars presented independently to the two eyes\footnote{for further explanations and examples see section \ref{sec:anzai}}, show elongated regions along lines of constant disparity \citep{ohzawa1990, ohzawa1997, anzai1999b}, that is, these cells are activated by stimuli at all positions in their receptive field as long as the disparity is appropriate.
% energy model
\citet{hubel1962} proposed a hierarchical model in which several simple cells with different optimal stimulus positions and otherwise equal RF properties feed into one complex cell and thereby make it position invariant. But it has long been noted that it need not be actual simple cells that constitute the input to complex cells in order to produce the reported nonlinearities.\footnote{see \citet{mechler2002} for a criticism of the classification into simple and complex cells in general} Thus \citet{ohzawa1990, ohzawa1997} suggest a more abstract hierarchical model in which the outputs of functional subunits are combined to produce the activity of a complex cell. This so-called binocular energy model consists of four binocular subunits with simple cell-like receptive fields. Each subunit is modelled as a linear binocular filter followed by a half-squaring nonlinearity. The first and second subunits, and likewise the third and fourth, can each be seen as forming one unit. The subunit RFs within each of the two units are sign-inverted versions of each other while RFs across units stand in a quadrature phase relationship, which means that the phases of the underlying Gabor functions\footnote{for the specification of Gabor functions see section \ref{sec:gabor}} are 90$^\circ$ apart. Thereby the sign-inversion leads to contrast invariance while the quadrature relationship results in position invariance. Left and right RFs of subunits have equal preferences for orientation and spatial frequency and differ only in properties responsible for the disparity tuning of the cell. This is in good agreement with physiological findings \citep{hubel1962, ohzawa1996}.
For simulations the four-subunit model can be substituted with an equivalent two-subunit model in which every unit consists of only one subunit, which applies full-squaring instead of half-squaring. An example of such a binocular energy model is depicted in figure \ref{fig:bem}A.
\begin{figure}[ht]
\begin{center}
\includegraphics[width=\textwidth]{fig/bem_pos-phase.jpg}
\caps{Energy model and disparity encoding. \textbf{A} Adapted binocular energy model used in our simulation. The model consists of two subunits which are binocular linear filters followed by a static nonlinearity (full-squaring). The output of the complex cell is a linear combination of the subunit outputs followed by a square root. For the energy model to act as a complex cell \citet{ohzawa1990, ohzawa1997} proposed that the subunit RFs stand in quadrature phase relationship as depicted here. Further, left and right RFs should be equal except for properties defining selectivity for disparity. \textbf{B} Mechanisms of encoding disparity. Exemplary 1D profiles of RFs taken along a line orthogonal to the orientation are shown. In the position model left and right profiles are equal but at different positions. In the phase model left and right profiles are at corresponding positions but have different shapes (profiles with a phase shift). Figure B adapted from \citet{ohzawa1997}}
\label{fig:bem}
\end{center}
\end{figure}
% position vs. phase
Two possible mechanisms for encoding disparity have been identified \citep{ohzawa1996,ohzawa1997,anzai1999} and are demonstrated in figure \ref{fig:bem}B. The position model assumes that the structure of left and right RFs is the same, but that the position of the two RFs is different and this difference then defines the preferred disparity of the cell. The phase model, on the other hand, assumes that the positions of the RFs are equal, but that RF structures differ in the two eyes by a given phase shift which in turn also defines optimal disparity. Evidence for both mechanisms has been found in physiological studies \citep{anzai1999,anzai1999b} and indeed the response of some cells can only be explained by a combination of both mechanisms \citep{prince2002}.
In my analysis I will present some of the studies mentioned here, especially \citet{anzai1999b} and \citet{prince2002}, in more detail in order to compare their findings with data obtained from our simulation.
% tuning types (Poggio 88)
%The visual system is often described as a hierarchy of feature detectors where on each higher stage of processing stimuli which effectively modulate the activity of a neuron get more complex \citep{KobTan94, RiePog02}.
%The region of the visual field in which presented stimuli influence the activity of a cell is defined as its receptive field (RF) \citep{DeaAnz03}. It is shown that these receptive fields get larger along the visual pathway and the stimuli which effectively modulate the activity of a neuron get more complex (increasing specificity) \citep{KobTan94} while at the same time higher neurons do not differentially respond to simpler features of the stimuli any more (increasing invariance) \citep{chalupa2003}.
\subsection{Optimally stable representations of natural visual stimuli}
\citet{koerding2004} report that optimally stable representations of natural visual stimuli reproduce many properties of complex cells very well. These properties include spatial frequency and orientation tuning as well as invariance to stimulus translation and contrast polarity. Selim Onat extended their framework to incorporate binocular stimuli, and on the basis of his results I assess whether optimally stable representations can describe binocular properties of cells in V1, too. Details of the following brief description of the simulation can be found in \citet{onat2005}.
Natural visual stimuli have been obtained by recording image sequences from two cameras which were mounted in parallel on the head of a freely moving cat. This yielded a pair of left and right natural images for each frame of the video, with the two images differing slightly due to the spatial separation of the two cameras (4.8cm). Because the theoretical fixation point of the two cameras lies at infinity, only near disparities should have been sampled, but a normalisation technique has been used to obtain all kinds of disparities. For each stimulus, four 20x20 pixel (px) patches have been extracted from a randomly chosen position within the raw images (left and right patch of frame $t$ and left and right patch of frame $t+1$, see figure \ref{fig:stim}). For reasons of computational efficiency, the input to the simulation was not these patches themselves, but their representations in the space of their first 100 principal components\footnote{actually components 2-101: the mean has been taken out}, which already explained $>95\%$ of their variance.
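The projection onto principal components can be sketched as follows. This is a Python illustration, not the original MATLAB code; the function and variable names are hypothetical, and the mean is removed across patches to mirror the simulation's use of components 2-101.

```python
import numpy as np

def pca_representation(patches, n_components=100):
    """Project flattened image patches onto their leading principal
    components. patches: array of shape (n_patches, n_pixels).
    Returns the PC-space coordinates and the fraction of variance
    explained by the retained components."""
    centred = patches - patches.mean(axis=0)      # remove the mean
    # principal axes from the SVD of the centred data matrix
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    coeffs = centred @ vt[:n_components].T        # coordinates in PC space
    explained = np.sum(s[:n_components]**2) / np.sum(s**2)
    return coeffs, explained
```

In the simulation the retained 100 components already explained more than 95\% of the patch variance, so little information is lost by this dimensionality reduction.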
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/stimuli.jpg}
\caps{Description of stimuli. \textbf{A} Setup of the cameras on a cat's head. \textbf{B} Example of a raw image recorded with one camera. The image is converted to greyscale before further processing. \textbf{C} Schematic representation of the choice of patches for one stimulus configuration within the optimisation. From the left and right images of frames $t$ and $t+1$ one patch each is chosen at a randomly drawn common position.}
\label{fig:stim}
\end{center}
\end{figure}
The simulation is based on the cell model depicted in figure \ref{fig:bem}A, which I introduced above. Thus, in a first step, the linear response of the first subunit of a simulated neuron is calculated as the sum of the inner products between the left patch, $I^l$, and the left weights, $W_1^l$, and between the right patch, $I^r$, and the right weights, $W_1^r$. The complete activation is then defined by
\begin{equation}
\label{eqn:act}
A = \sqrt{(W_1^lI^l+W_1^rI^r)^2 + (W_2^lI^l+W_2^rI^r)^2}.
\end{equation}
This definition of activation allows for many different response properties depending on the weights $W$, which I will refer to as the RFs of the simulated neurons in the following. As mentioned earlier, \citet{ohzawa1990} proposed that the simulated cells are complex-like if the RFs of the subunits resemble two-dimensional Gabor functions standing in quadrature phase relationship. I will examine this property in section \ref{sec:subunit}.
A given set of 100 cells should be optimally stable, which means that the activity of a cell should change as little as possible within one time step. This objective is motivated by the fact that relevant features within the visual field, like certain objects, do not change as rapidly as low-level features like luminance in a limited region of the visual field. To obtain stable representations of the stimuli, the activity of the cells to the given patches has been optimised with respect to an objective function. A cell which does not react to any stimulus would be maximally stable. Obviously this is not wanted, and to prevent it a term has been included in the stability objective which reinforces some variability in the overall responses of a cell. Furthermore, a decorrelation term has been utilised to penalise common responses of the cells, which should lead to different response patterns for each cell. Thus, in the simulation the following objective function has been maximised
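The activation equation above translates directly into code. The following is a minimal sketch in Python rather than the MATLAB used for the simulation; all names are hypothetical.

```python
import numpy as np

def complex_cell_activity(W1l, W1r, W2l, W2r, Il, Ir):
    """Two-subunit binocular energy model: each subunit is a linear
    binocular filter followed by full-squaring; the cell output is the
    square root of the summed subunit outputs."""
    s1 = W1l @ Il + W1r @ Ir   # linear response of subunit 1
    s2 = W2l @ Il + W2r @ Ir   # linear response of subunit 2
    return np.sqrt(s1**2 + s2**2)
```

When the subunit RFs are Gabor functions in quadrature phase relationship, this combination yields the position invariance characteristic of complex cells.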
\begin{equation}
\Psi = \underbrace{-\sum_{i=1}^N\frac{\langle (A_i(t) - A_i(t+\Delta t))^2 \rangle}{\sigma_{ii}}}_{\Psi_{stable}} - \underbrace{\frac{1}{(N-1)^2} \sum_{i \neq j}\sigma_{ij}^2}_{\Psi_{decorr}}
\end{equation}
where $N$ is the number of cells, $\langle\rangle$ denotes the average over stimuli and thus over time, $A_i(t)$ is the activity of neuron $i$ at time $t$ and $\sigma_{ij}$ is the covariance over stimuli between cells $i$ and $j$.
The converged optimisation process yields, for each subunit of each simulated neuron, weights which represent (binocular) properties of the presented natural stimuli in an optimally stable way. In the following I analyse these artificial RFs in several ways, as explained in the methods section (\ref{sec:methods}), to attain quantitative descriptions of RF properties comparable to physiological data.
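A sketch of this objective in Python (the simulation itself was written in MATLAB; names are hypothetical, and the covariance is estimated here from the activities at time $t$ only, which is an assumption of this sketch):

```python
import numpy as np

def stability_objective(A_t, A_tdt):
    """Psi = Psi_stable + Psi_decorr for N cells.
    A_t, A_tdt: arrays of shape (N, S) with the activities of N cells
    to S stimulus pairs at times t and t + dt."""
    N = A_t.shape[0]
    sigma = np.cov(A_t)                      # covariance over stimuli
    # stability term: mean squared activity change, normalised by variance
    psi_stable = -np.sum(np.mean((A_t - A_tdt)**2, axis=1) / np.diag(sigma))
    # decorrelation term: penalise covariance between different cells
    off_diag = sigma - np.diag(np.diag(sigma))
    psi_decorr = -np.sum(off_diag**2) / (N - 1)**2
    return psi_stable + psi_decorr
```

Maximising this function rewards cells whose activities change little between frames while remaining variable overall and mutually decorrelated.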
\newpage
\section{Methods}
\label{sec:methods}
Different physiological studies often employ different analyses while investigating the same or very similar subjects. Thus, for a comparison of our simulation data with a wide range of physiological data, I had to apply several different analysis techniques, which are described in detail below. A basic RF analysis has been done by fitting 2D-Gabor functions to the RFs of the subunits. Further, I estimated the disparity tuning properties of the simulated neurons with random-dot stereograms (RDS) and fitted the resulting tuning curves with 1D-Gabor functions. Finally, I obtained binocular interaction profiles of our cells, which I decomposed into components with a singular value decomposition, whose properties I in turn analysed with 1D-Gabor fits. All programming has been done in MATLAB$^\mathrm{\textregistered}$ (The MathWorks, Inc.).
\subsection[Fitting 1D and 2D-Gabor functions]{Fitting 1D and 2D-Gabor functions to disparity tuning curves and RFs, respectively}
\label{sec:fitting}
Curve fitting can be used to gain insight into given data by describing the data with the parameters of a certain model. My data are one-dimensional disparity tuning curves and two-dimensional RFs, and I want to express these in terms of Gabor functions% \footnote{see section \ref{sec:basic} for justification of the use of Gabor functions}
, which are nonlinear in their parameters. Therefore, I use a nonlinear least squares method to fit the Gabors to my data. Details of the fitting procedure can be found in the corresponding sections below.
\subsubsection{Gabor functions}
\label{sec:gabor}
Gabor functions are defined as the product of a Gaussian and a sinusoidal component \citep{gabor1946}. For 1D-Gabor functions I use the definition
\begin{equation}
G(x) = A_o + A \cdot \exp\left(\frac{-(x-x_0)^2}{2W^2}\right) \cdot \cos(2\pi f(x-x_0) + \phi),
\end{equation}
where $A_o$ is the amplitude offset, $A$ is the amplitude, $x_0$ is the centre of the Gaussian envelope, $W$ is the width of the Gaussian, $f$ is the frequency of the sinusoid and $\phi$ describes its phase. In two dimensions another Gaussian envelope is added. Thus, along with the definition of the centre in $X$- ($x_0$) and $Y$- ($y_0$) coordinates, two width parameters are needed - one for the minor axis ($W_p$) and one for the major axis ($W_q$). Further, parameters $\theta$ and $\gamma$ are used to determine the orientation of the sinusoid and the Gabor envelopes, respectively. This yields for 2D-Gabor functions
\begin{equation}
G(x,y) = A_o + A \cdot \exp\left(\frac{-p^2}{2W_p^2}\right) \cdot \exp\left(\frac{-q^2}{2W_q^2}\right) \cdot \cos(2\pi fu + \phi),
\end{equation}
with
\[
p = (x-x_0)\cos \gamma + (y-y_0)\sin \gamma,
\]
\[
q = -(x-x_0)\sin \gamma + (y-y_0)\cos \gamma,
\]
and
\[
u = (x-x_0)\cos \theta + (y-y_0)\sin \theta.
\]
These functions match the definitions given in \citet{anzai1999} and \citet{prince2002a}. Preliminary experiments have shown that allowing $\gamma$ to vary freely improves fits of RFs only marginally over fits in which $\gamma$ was set equal to $\theta$. Therefore I simplify the 2D-Gabors by postulating that $\gamma = \theta$, which additionally has computational advantages. Correspondingly, I will only write $\theta$ below.
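These definitions translate directly into code. The following Python sketch (the original fits were done in MATLAB) implements the 1D-Gabor and the simplified 2D-Gabor with $\gamma = \theta$:

```python
import numpy as np

def gabor_1d(x, A0, A, x0, W, f, phi):
    """1D Gabor: Gaussian envelope times a cosine carrier."""
    return A0 + A * np.exp(-(x - x0)**2 / (2 * W**2)) \
              * np.cos(2 * np.pi * f * (x - x0) + phi)

def gabor_2d(x, y, A0, A, x0, y0, Wp, Wq, f, phi, theta):
    """2D Gabor with common orientation theta for envelope and carrier
    (gamma = theta, as postulated for the fits)."""
    p = (x - x0) * np.cos(theta) + (y - y0) * np.sin(theta)
    q = -(x - x0) * np.sin(theta) + (y - y0) * np.cos(theta)
    u = p  # with gamma = theta, u coincides with p
    return A0 + A * np.exp(-p**2 / (2 * Wp**2)) \
              * np.exp(-q**2 / (2 * Wq**2)) \
              * np.cos(2 * np.pi * f * u + phi)
```

At the centre $(x_0, y_0)$ both envelopes equal one, so the value reduces to $A_o + A\cos\phi$, which is a convenient sanity check on any implementation.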
For an example of 1D- and 2D-Gabor functions see figure \ref{fig:gabor}. To gain a better understanding of the influences of each parameter in 1D-Gabors I have written a Matlab demo, which is a graphical interface allowing for interactive manipulations of all parameters of a 1D-Gabor function.
\begin{figure}[ht]
\begin{center}
\includegraphics[width=\textwidth]{fig/gabor.pdf}
\caps{Gabor functions. \textbf{A} 2D-Gabor function. The function is the product of two Gaussian envelopes and a sinusoidal component. The envelopes determine the width of the function along and orthogonal to the orientation which is determined by $\theta$ (here $\theta = -\pi/7$). Centre lies at $(x_0,y_0)$. \textbf{B} 1D-profile of the function shown in A along the line orthogonal to the orientation. The result is a 1D-Gabor function which is the product of the minor Gaussian envelope with the sinusoidal component. Width of the Gaussian is determined by $W_p$ and amplitude by $A$. A small amplitude offset $A_o$ has been introduced to illustrate that parameter. Phase in this example is $\phi = \pi/2$}
\label{fig:gabor}
\end{center}
\end{figure}
\subsubsection{Nonlinear least squares and Levenberg-Marquardt}
In the nonlinear least squares method the difference between fitted curve and data is minimised. This difference is defined by the $\chi^2$ function, which depends on the values of the parameters $\boldsymbol{p} = (p_1, \dots, p_k)$
\[
% \chi^2(\boldsymbol{p}) = \sum_{i=1}^N \left(\frac{d_i - y(x_i;\boldsymbol{p})}{\sigma_i}\right)^2,
\chi^2(\boldsymbol{p}) = \sum_{i=1}^N \left(d_i - y(x_i;\boldsymbol{p})\right)^2,
\]
where $N$ is the number of data points, $d_i$ is the data value at point $x_i$ and $y$ is the value of the model with parameters $\boldsymbol{p}$ at this point. Usually the addends of the $\chi^2$ function are weighted by the variance of the data points such that unreliable points influence the fitting procedure less than reliable ones. However, I omit the weighting term, because either the implementation of the fitting algorithm mentioned below does not support weights\footnote{this is a problem I did not solve, but first comparisons with fits of another program, which supports weighting, revealed no crucial differences} or the variances of the data points are not given (RF and SVD fits).
I use the popular Levenberg-Marquardt method \citep{press1992} to find a minimum of the $\chi^2$ function. This algorithm is implemented in the MATLAB$^\mathrm{\textregistered}$ function \verb|lsqcurvefit|, which also calculates values of the $\chi^2$ function itself. Thus the crucial work in fitting Gabors lies in properly defining the function to fit and in finding good starting values for the optimisation.
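In MATLAB this is what \verb|lsqcurvefit| does; a comparable sketch in Python, fitting a noiseless synthetic tuning curve with SciPy's Levenberg-Marquardt implementation, could look as follows (the data and all parameter values here are hypothetical, for illustration only):

```python
import numpy as np
from scipy.optimize import least_squares

def gabor_1d(x, A0, A, x0, W, f, phi):
    # 1D Gabor: Gaussian envelope times a cosine carrier
    return A0 + A * np.exp(-(x - x0)**2 / (2 * W**2)) \
              * np.cos(2 * np.pi * f * (x - x0) + phi)

# synthetic "tuning curve" data
x = np.linspace(-5, 5, 41)
true_p = (0.1, 1.0, 0.5, 1.5, 0.4, 0.3)
data = gabor_1d(x, *true_p)

def residuals(p):
    # chi^2 is (half) the sum of these squared residuals
    return gabor_1d(x, *p) - data

# starting values; the frequency entry is set to the FFT estimate
start = (0.0, 0.8, 0.0, 2.0, 0.4, 0.0)
fit = least_squares(residuals, start, method='lm')
```

As in the thesis, the crucial work lies in defining the model function and choosing good starting values; the minimiser itself is a library routine.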
Depending on whether I fit one-dimensional or two-dimensional data, I apply the corresponding Gabor definition as given above.%, but in the case of one dimensional tuning data obtained with RDS stimuli (section \ref{sec:rds}) from which I take the square root in a normalising step, I also fit the square root of the 1D-Gabor function.
\subsubsection{Finding good starting values}
Good starting values are essential for the optimisation to converge to a meaningful local minimum. This becomes more important with more complex functions like the 2D-Gabor or the square root 1D-Gabor. Therefore I developed heuristics to estimate the values of critical parameters before optimisation. Phase, position and width of the Gabor functions turned out to be noncritical in most cases, that is, even with poor estimates of these parameters the optimisation tends to converge as long as the other estimates are good.
In the 1D case I determine the frequency of the underlying Gabor function with a discrete Fourier transform of the data by setting this frequency to the one with the most power in the resulting power spectrum. This estimate is so good that further optimisation of the frequency is not needed. Consequently I fix this parameter to the estimated value during optimisation. In addition, this procedure prevents changes in the Gaussian component from being cancelled by a change in frequency \citep[see][for an example]{prince2002a}. The heuristic for the amplitude of the 1D-Gabor depends most importantly on the standard deviation of the data; the heuristic for the amplitude offset depends on the mean or, for tuning curves, on the values of the first and last data points, because these points estimate the activity for uncorrelated stimuli in the two eyes, which should be the baseline of the tuning curve.
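The frequency heuristic can be sketched as follows (a Python illustration; the thesis code was MATLAB and the function name is hypothetical):

```python
import numpy as np

def estimate_frequency(d, dx=1.0):
    """Estimate the carrier frequency of a 1D Gabor-like signal as the
    frequency with maximal power in the discrete Fourier spectrum
    (DC component excluded)."""
    d = d - d.mean()                         # remove offset / DC
    power = np.abs(np.fft.rfft(d))**2        # power spectrum
    freqs = np.fft.rfftfreq(len(d), d=dx)
    return freqs[np.argmax(power[1:]) + 1]   # skip the DC bin
```

The returned value is then held fixed during the Gabor fit, as described above.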
When finding starting values for 2D-Gabor functions I rely on previously written scripts for estimating the centre of the 2D data (receptive fields) and the frequency. Again, the frequency is fixed to a value determined with a fast Fourier transform. Additionally, I fix the amplitude offset to $0$ during optimisation, which is the mean value of the RFs. The amplitude is estimated by the maximum value of the data, and the orientation is found by correlating differently oriented sinusoidal gratings with the RFs. The orientation whose grating correlates best with the RF is taken as the estimate.
In the case of binocular data I use the optimal parameters found for the left data as starting values for the right data. This ensures that both fits obtain comparable parameters.
\subsubsection{Goodness of fit}
I have three criteria, which together specify goodness of a fit. First, the optimisation must converge to a solution. Second, I calculate the $R^2$ value of the fit. And third, I determine confidence intervals for the fitted parameters.
Most of the time optimisation finds a minimum of the $\chi^2$ function with starting values as described above. If this is not the case, I refit the data with starting values, which are carefully chosen by hand. Thus all fits reported here are the result of a converged optimisation process.
The $R^2$ value gives an estimate of how much of the data variance is explained by the fit. Here I use a degree-of-freedom adjusted definition of $R^2$ in which the sum of squared errors (SSE) of the fit is related to the sum of squares around the mean (SST) of the data. If the SSE is small compared to the SST, then the fit is considered good and hence explains much of the variance of the data. The following definition yields 1 if 100\% of the variance is explained
\[
R^2 = 1-\frac{(n-1)\mathrm{SSE}}{(n-m)\mathrm{SST}} = 1 - \frac{(n-1)\sum_{i=1}^n w_i (d_i - y_i)^2}{(n-m)\sum_{i=1}^n w_i (d_i - \bar{d})^2},
\]
where $n$ is the number of data points, $m$ is the number of fitted parameters, $w_i=1/\sigma_i^2$ weights each addend with the variance of the data point, $d_i$ is the data value, $\bar{d}$ is the mean over all data points and $y_i$ is the value of the fitted curve.
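Written out, the adjusted $R^2$ computation reads as follows (a Python sketch; the weights default to 1, matching the unweighted fits described above):

```python
import numpy as np

def adjusted_r_squared(d, y, m, w=None):
    """Degree-of-freedom adjusted R^2 of a fit.
    d: data values, y: fitted values, m: number of fitted parameters,
    w: optional weights (1/variance per data point); defaults to 1."""
    d, y = np.asarray(d, float), np.asarray(y, float)
    n = len(d)
    if w is None:
        w = np.ones(n)
    sse = np.sum(w * (d - y)**2)             # sum of squared errors
    sst = np.sum(w * (d - d.mean())**2)      # sum of squares around mean
    return 1 - (n - 1) * sse / ((n - m) * sst)
```

A perfect fit gives exactly 1; a fit no better than the mean of the data gives a value near 0 (slightly below, due to the degree-of-freedom adjustment).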
Confidence intervals for the fitted parameters are estimated as described in \citet{press1992}. For this I evaluate how the first partial derivatives of the fitted function with respect to the different parameters interact over all data points, which yields the covariance matrix of the fitted parameters. I then extract standard errors for the parameters from this matrix and determine 95\% confidence intervals as the region of two standard errors around the parameter value. Unfortunately, these confidence intervals are quite large in comparison to the fitted values. For example, the maximum of the absolute value of the fitted positions over all 100 (1D) tuning curves is $4.8$ whereas the minimum of $2\cdot$standard error is $6.2$. This makes clear that almost all fitted positions lie within every single confidence interval and renders these intervals seemingly useless. To overcome this flaw I included second partial derivatives in my calculations, which should have increased the accuracy of the estimated confidence intervals. Although this increases the range of observed interval sizes (the minimum is now circa 0.8), many confidence intervals remain large (median 7.3). Therefore, for all fits except those where a 1D-Gabor function with fixed frequency is fitted, I only consider first partial derivatives, because the gain from second partial derivatives does not justify the increase in complexity. Nevertheless, the size of the confidence intervals still gives a good qualitative description of the accuracy of the estimated parameter values, which means that a fit with comparably large confidence intervals often also has questionable parameter values. This applies especially to tuning curves with very low frequency for which it is hard to determine the correct position (see figure \ref{fig:rds}C for an example).
\subsection{Testing RFs with random dot stereograms}
\label{sec:rds}
\citet{prince2002,prince2002a,read2003} and \citet{cumming2002} use random-dot stereograms (RDS) to test neurons in V1 for disparity tuning. In order to compare my artificial cells with their findings I do the same.
Random-dot stereograms consist of bright and dark dots at random positions on a grey background. Thus they allow testing for disparity tuning independently of orientation, spatial frequency or the exact position of the stimulus, because they contain complete spectra of these parameters and are correspondingly called broadband stimuli.
\subsubsection{Generating RDS}
To produce RDS I first generate one large random-dot stimulus, which contains an equal number of white (value $=1$) and black (value $=-1$) dots placed at randomly drawn positions. All other parts of the stimulus get a value of 0. In my computations I use the smallest possible dot size of 1 pixel, but dot sizes may take larger values as long as these are multiples of one pixel. Since dot positions are drawn independently, one dot may partially or fully cover another. The actual number of generated dots is controlled by a given dot density, which specifies the percentage of the RDS area to be covered with dots. For different comparisons with reported data I use a dot density of either 25\% or 50\%, but the results of both are very similar. The final step in producing a random-dot stimulus is the subtraction of the mean from all values such that the mean of the stimulus is 0.\footnote{this is only needed when, e.g. through covering, unequal numbers of bright and dark dots remain}
From the large random-dot stimulus I extract the RDS according to a given disparity. To this end I define one region of the large stimulus as the reference (right) and obtain the other region (left) by moving away from the reference by an amount determined by the given disparity. As long as these regions have some overlap they represent some disparity; otherwise they are uncorrelated.
I use two forms of disparity. In the horizontal disparity condition I move the left region only along the $X$-axis. This is depicted for one RDS in figure \ref{fig:rds}B, which shows an RDS with a disparity of $-3$ pixels, representing a near stimulus. Alternatively, one can define disparity orthogonal to the RF orientation of the cell. In this case I shift the left region along a line orthogonal to this orientation instead of only along the abscissa. I define orthogonal disparity such that if the orthogonal line is horizontal, a negative disparity results in a left shift of the left region and therefore represents a near stimulus, exactly as in the horizontal condition.
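The two steps above, generating the large random-dot stimulus and extracting a disparity-shifted pair, can be sketched as follows. This is a hedged Python/NumPy illustration, not the original code; the function names are mine, and drawing positions without replacement is a simplification that avoids covered dots.

```python
import numpy as np

def random_dot_stimulus(height, width, density=0.25, rng=None):
    """Large random-dot stimulus: equal numbers of bright (+1) and dark (-1)
    single-pixel dots on a 0 background; the mean is subtracted at the end."""
    rng = np.random.default_rng() if rng is None else rng
    stim = np.zeros((height, width))
    n_dots = int(density * height * width)
    # draw dot positions; without replacement no dot covers another
    idx = rng.choice(height * width, size=n_dots, replace=False)
    vals = np.tile([1.0, -1.0], n_dots // 2)
    stim.flat[idx[:vals.size]] = vals
    return stim - stim.mean()

def rds_pair(stim, size=20, disparity=0):
    """Extract a right (reference) patch and a left patch shifted
    horizontally by `disparity` pixels (negative = shift left = near)."""
    h, w = stim.shape
    top, left0 = (h - size) // 2, (w - size) // 2
    right = stim[top:top + size, left0:left0 + size]
    left = stim[top:top + size, left0 + disparity:left0 + disparity + size]
    return left, right
```

For the orthogonal condition the left region would be shifted along a line orthogonal to the RF orientation instead of along the row axis.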
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/rds.pdf}
\caps{Activity of cell 78 to random-dot stereograms (RDS) with different disparities. \textbf{A} RFs of cell 78 with orientation slightly less than $\pi/2$ (orientation defined as shown in fig. \ref{fig:gabor}A). \textbf{B} Example of one RDS with 25\% dot density and a horizontal disparity of $-3$ pixels, which corresponds to a crossed disparity (near) and a shift of the left stimulus 3 pixels to the left with respect to the right stimulus. A disparity orthogonal to the RF orientation would mean that the left random-dot stimulus would be shifted 3 pixels up in this case. \textbf{C} Mean activities to RDS with different horizontal disparities, fitted with a 1D-Gabor function. Error bars depict estimated 95\% confidence intervals for the plotted means; the red data point shows the activity to the RDS shown in B. Although the RF frequency is high, the frequency of the tuning curve is very low. This makes it hard to determine the position (dotted line) of the fitted Gabor with high confidence (95\% confidence interval for position $=3.12 \pm 63.11$ px, $R^2=0.97$).}
\label{fig:rds}
\end{center}
\end{figure}
\subsubsection{Activities to RDS}
\label{sec:rdsact}
I "show" these generated RDS, which are of the same size as the RFs (20x20 pixels per eye), to the simulated cells in order to estimate their disparity tuning curves. These tuning curves describe the activity of a cell at certain disparities, where the activity of a cell to a stimulus is defined as in equation \ref{eqn:act} and figure \ref{fig:bem}A. The probabilistic nature of RDS makes it necessary to average the activities over sufficiently many RDS with the same disparity in order to obtain a meaningful estimate of disparity tuning. In my case 1000 different RDS suffice to get useful data points which can be fitted with a 1D-Gabor function\footnote{Most means are 14-16 times larger than their 95\% confidence intervals (figure \ref{fig:rds}C). Since the confidence interval of the means, $c$, shrinks with the square root of the number of presented RDS, $r$, reducing $c$ by a factor of $n$ requires increasing $r$ to $r\cdot n^2$}. Data points of one such tuning curve are mean activities to RDS at all disparities possible with a resolution of one pixel, that is disparities $-20,-19,\dots,0,\dots,20$, where $-20$ and $20$ already represent uncorrelated stimuli, because left and right stimulus have no overlap. Additionally, I estimate monocular activities to random-dot stimuli by showing a blank stimulus (all values equal) to one eye and a random-dot stimulus to the other. Note that the activity to a blank stimulus is always zero.
It is known that the variance of cell activities rises approximately proportionally with their mean. \citet{prince2002a} propose to take the square root of all computed activities before further processing to normalise variances, but this transform is not appropriate in my case (see \ref{sec:princeact}). Thus I directly weight data points by their variance when calculating statistical measures that assume equal variances.
To evaluate the disparity tuning of the artificial cells I use the disparity discrimination index (DDI) \citep{prince2002a}. This measure takes into account the amplitude of disparity tuning and additionally the variance of the tuning curve in order to describe how well a cell can discriminate between its preferred disparities. It is defined in terms of the maximum mean activity of the cell ($a_{max}$) and the corresponding minimum ($a_{min}$) as follows
\begin{equation}
\label{eq:ddi}
\mathrm{DDI} = \frac{a_{max}-a_{min}}{a_{max}-a_{min} + 2\cdot\mathrm{RMS}}
\end{equation}
The root mean squared error (RMS) is the square root of the summed variances around the data points across the tuning curve
\[
\mathrm{RMS} = \sqrt{\frac{\mathrm{SSE}}{r-d}} = \sqrt{\frac{\sum_{i=-20}^{20} \sum_{j=1}^r (a_{ij} - \bar{a}_i)^2}{r-d}}
\]
where $r$ is the number of different RDS used with each disparity (250), $d$ is the number of different disparities (41), $a_{ij}$ is the activity to RDS $j$ with disparity $i$ and $\bar{a}_i$ is the mean activity to RDS with disparity $i$. \citet{prince2002a} performed all of these calculations on the square root of activities, that is, on more or less normalised variances. The corresponding operation on my data would be to weight the addends of the SSE by the variance of $\bar{a}_i$, but this almost completely eliminates the influence of the variances on the DDI, which contradicts the idea behind the DDI. Thus the SSE remains unweighted.
Monocular activities ($a_L,a_R$) are used to compute the ocular dominance index
\begin{equation}
\label{eq:odi}
\mathrm{ODI} = \frac{a_L}{a_L+a_R}
\end{equation}
where, in physiology, monocular activities are taken relative to the recording site, i.e. instead of left and right one takes the ipsilateral and contralateral eye. Cells with an ODI near 0.5 are well balanced between left and right eye and hence are said to be binocular.
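The DDI of equation \ref{eq:ddi}, with the RMS as defined above, and the ODI can be computed as in this sketch. Python/NumPy is used for illustration; the function names and the array layout are my own assumptions.

```python
import numpy as np

def ddi(activities):
    """Disparity discrimination index.

    activities: (d, r) array of activities, one row per disparity,
    r RDS presentations each.
    """
    d, r = activities.shape
    means = activities.mean(axis=1)
    sse = ((activities - means[:, None]) ** 2).sum()
    rms = np.sqrt(sse / (r - d))              # denominator as given in the text
    a_max, a_min = means.max(), means.min()
    return (a_max - a_min) / (a_max - a_min + 2.0 * rms)

def odi(a_left, a_right):
    """Ocular dominance index; values near 0.5 indicate binocular cells."""
    return a_left / (a_left + a_right)
```

With zero variance around the means the DDI is 1; rising trial-to-trial variance drives it towards 0.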
% to estimate confidence intervals for the means of activity or data points, respectively: $V(\bar{a}) = V(a)/r \Rightarrow \sigma_{\bar{a}} = \sigma_{a}/\sqrt(r)$ (from Efron and Tibshirami "An Introduction to the Bootstrap")
\subsection[Disparities from SVD components of interaction profiles]{Obtaining disparities from SVD components of binocular interaction profiles of RFs}
It is not possible to directly measure the activity of the subunits underlying the response of complex cells. To approximate the characteristics of the subunits anyway, \citet{anzai1999b} calculated SVD (singular value decomposition) components of binocular interaction profiles and argued that these closely resemble the binocular interaction profiles of hypothetical subunits. For a direct comparison of their data with mine I retrace their calculations on my cells.
\subsubsection{Binocular interaction profiles}
\label{sec:iprf}
\citet{anzai1999b} measure binocular interaction RFs with spatiotemporal white noise generated according to binary m-sequences. The resulting profiles can be interpreted as the mean response of a cell to two bars with optimal spatial frequency and orientation, which are presented at different positions to the two eyes (one bar per eye). They further define the interaction profile as the difference between responses to bars with equal contrast in both eyes (match) and responses to bars with opposite contrast in the two eyes (mismatch) \citep[see also][]{ohzawa1997}.
I can calculate corresponding interaction profiles directly from the 2D-Gabor fits of the simulated RFs. To this end I assume that a presented bar has either value 1 (bright) or $-1$ (dark). The activity to a bar with optimal orientation can then be computed according to the cell model from the projection of the two-dimensional fit onto a one-dimensional line orthogonal to the preferred orientation at the centre of the fit. Such a projection is already illustrated in figure \ref{fig:gabor}. The value of this 1D-profile at the position of the hypothetical bar in the left and in the right eye corresponds to $W^lI^l$ and $W^rI^r$, respectively, from equation \ref{eqn:act} (definition of activity), which is then further processed according to the cell model to yield the activity of the cell to matching contrast bars at position $X_L$ in the left eye and position $X_R$ in the right eye. These positions are defined relative to the centre of the 2D-Gabor fit, so $X_L=0$ means that the bar in the left eye is presented in the centre of the RF. If the centre of a fit lies more than 2.2 pixels away from the centre of the patch\footnote{the patch containing the 2D-Gabor has a size of 20x20 pixels, its centre in terms of the Gabor function is (10.5,10.5)}, then the centre of the patch is chosen as reference instead, because otherwise no complete interaction profile could be produced. The value 2.2 is selected such that this reset of RF centres only happens to 2D-Gabor fits with very low frequencies, for which it is hard to determine the correct centre anyway. In this way I achieve a range of positions from -7.3 to 7.3 pixels. At the end of the process I again get 2D-profiles, this time in the dimensions of bar positions in the two RFs.
The only difference in the calculation of the activity to non-matching contrast bars is that the 1D profile of one eye is inverted\footnote{one bar has value -1, so it inverts the weights, e.g. in $W^rI^r$}. Thus, in the mismatch condition, instead of adding the weighted input sums of both eyes in the activity equation, I subtract one weighted sum from the other.
The binocular interaction profile is then produced as defined by \citet{anzai1999b}, by subtracting the non-matching contrast profile from the matching one, as illustrated in figure \ref{fig:iprf}.
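The match, mismatch and interaction profiles can be sketched as below. This is a minimal illustration that assumes a squaring output nonlinearity, as in the energy model; equation \ref{eqn:act} is not shown in this section, so the nonlinearity and the names here are my assumptions. Under the squaring assumption the interaction profile reduces to four times the outer product of the monocular profiles.

```python
import numpy as np

def interaction_profile(w_l, w_r, nonlin=np.square):
    """Binocular interaction profile from monocular 1D bar profiles.

    w_l, w_r: values of the projected Gabor fits at each bar position
    (left positions index the rows, right positions the columns).
    nonlin stands in for the cell model's output nonlinearity.
    """
    # match: bars of equal contrast, monocular weighted sums add
    match = nonlin(w_l[:, None] + w_r[None, :])
    # mismatch: one bar inverted, one weighted sum is subtracted
    mismatch = nonlin(w_l[:, None] - w_r[None, :])
    return match - mismatch
```

With `nonlin=np.square` this yields $(l+r)^2-(l-r)^2 = 4lr$ at each pair of bar positions, which is why the profile is elongated along the axis of equal disparity for matched Gabor pairs.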
\begin{figure}[ht]
\begin{center}
\includegraphics[width=\textwidth]{fig/iPrf.pdf}
\caps{Calculation of the binocular interaction profile from the interaction profiles of bars with matching and non-matching contrast, illustrated for cell 1. $X_L$ and $X_R$ describe the positions of the left and right bar with respect to the RF centre and range between -7.3 and 7.3 pixels. Colours represent the activity to a given configuration of bars. The colourmap is scaled according to the data: in the match and mismatch profiles blue represents 0, whereas in the interaction profile blue corresponds to the negative of the value given to red and green is 0. Red has the same interpretation in all three plots.}
\label{fig:iprf}
\end{center}
\end{figure}
\subsubsection{Calculating disparities from SVD components}
\label{sec:defdisp}
Subsequently, a singular value decomposition (SVD) is conducted using the built-in MATLAB$^\mathrm{\textregistered}$ function \verb|svd|. The SVD finds mutually uncorrelated components of the original binocular interaction profiles and weights them according to the influence each component has on the original profiles, or, put another way, according to the amount of variance of the original profile they account for \citep[see e.g.][]{press1992}. As a result, the original profiles can be described as a linear sum of their SVD components.
In my case the SVD is mathematically equivalent to a principal component analysis (PCA) if one interprets the binocular interaction profiles as the covariance matrices needed for the PCA \citep{anzai1999b}. Thus, with some limitations, the SVD on binocular interaction profiles can be described as a PCA on bar positions within the RFs.
The outputs of the SVD are left ($l$) and right ($r$) 1D-profiles for each SVD component along with its weight $w$. Thus the first component of a binocular interaction profile can be computed by
\[
b_1 = l_1\cdot w_1 \cdot r_1^t, \qquad \mathrm{where}\; r_1^t\; \mathrm{is\; the\; transpose\; of}\; r_1.
\]
The monocular profiles are fitted with 1D-Gabor functions, which are used to determine the position and phase disparities of the first SVD component. While phase disparity is computed by simply subtracting the left profile phase from the right profile phase, I use a reference cell method for the calculation of position disparity, in correspondence with \citet{anzai1999b}. Here the second component is taken as reference and is assumed to have no position disparity. In order to comply with the conventions introduced by the definition of phase disparity I subtract the right position from the left position, which means that a positive disparity corresponds to far stimuli for both kinds of disparity.
% But as the order of $l$ and $r$ is chosen arbitrarily in the equation above, far and near disparities could be exchanged in this analysis. Actually it is not chosen arbitrarily, but in a way such that it fits with the original interaction profiles. Important is that left positions correspond to values in the rows of the interaction profiles.
The resulting calculation for position disparity describes a relative position disparity and is as follows
\[
\Delta x = \Delta x_1 - \Delta x_2, \qquad \Delta x_i = x_i^l - x_i^r,
\]
where $x_i^l$ and $x_i^r$ are left and right Gabor positions of the $i_\mathrm{th}$ component. So, with $\Delta x$ and $\Delta x_i$ I obtain distributions for relative as well as absolute position disparities of first and second SVD components.
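The decomposition and the relative position disparity defined above can be sketched as follows. Python/NumPy replaces the MATLAB \verb|svd| call here, and the function names are hypothetical; the sign conventions follow the definitions in the text.

```python
import numpy as np

def first_two_components(b):
    """First two SVD components of a binocular interaction profile b,
    as (left_profile, weight, right_profile) triples such that b is
    approximated by w1 * outer(l1, r1) + w2 * outer(l2, r2)."""
    u, s, vt = np.linalg.svd(b)
    return (u[:, 0], s[0], vt[0]), (u[:, 1], s[1], vt[1])

def relative_position_disparity(x1_l, x1_r, x2_l, x2_r):
    """Delta x = Delta x_1 - Delta x_2 with Delta x_i = x_i^l - x_i^r;
    the second component serves as the zero-disparity reference."""
    return (x1_l - x1_r) - (x2_l - x2_r)
```

The position parameters $x_i^l$ and $x_i^r$ would come from 1D-Gabor fits to the returned monocular profiles, as described in the text.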
For completeness I here also state how I compute disparities directly from 2D-Gabor fits of RFs. Phase disparity is calculated exactly as described above. I define the horizontal position disparity of one subunit, $\Delta x_i$, to be the distance between two lines running through left and right RF centres, respectively, with a slope corresponding to the orientation of the RF. The sign of $\Delta x_i$ is determined such that it corresponds to $x_i^l - x_i^r$. $\Delta x$ then mathematically matches the relative position disparity perpendicular to RF orientation given in \citet{anzai1999}.
\newpage
\section{Evaluation of simulation results}
Many contemporary publications on the topic of binocular cells in striate cortex discuss the importance of phase versus position encoding of visual disparity. This is therefore also the prevalent ground for my comparisons of optimally stable cells and real neurons. Here I contrast the simulation mainly with two physiological data sets, which examine binocular properties of neurons in V1 at different levels of analysis. The data from \citet{anzai1999b} describe binocular properties of functional complex cell subunits obtained by evaluating binocular interaction profiles, and the data from \citet{prince2002,prince2002a} give measures for the disparity tuning of striate cortex cells obtained with RDS. In particular, it has to be noted that \citet{prince2002,prince2002a} examine solely responses to horizontal disparity, while \citet{anzai1999b} investigate responses to disparity orthogonal to RF orientation. Before I turn to these data sets I will give a basic analysis of my subunit RF properties and roughly compare the results of this analysis with reported properties of simple cells. All my comparisons suggest that a stability criterion can explain many properties of binocular complex cells, but I also find clear differences between simulation and physiology, which are partly due to the cell model used. These differences are discussed in section \ref{sec:discussion}. %, which require further investigations and changes in cell model or optimisation
\subsection{Subunit RFs and simple cells}
\label{sec:subunit}
Our cell model allows for infinitely many subunit RF shapes, but \citet{ohzawa1990,ohzawa1997} point out that subunits should exhibit simple cell-like RFs, which stand in quadrature phase relationship, in order for the cell to act as a complex cell. Here I report that stable subunit RFs correspond well to the proposed binocular energy model, although they also show deviations in direct comparison with properties of simple cells.
\subsubsection{Basic properties}
\citet{koerding2004} already showed that optimally stable subunit RFs obtained with monocular natural images exhibit the properties expected if an energy model is assumed to underlie complex cell activities. These properties include selectivity for stimulus position, orientation and spatial frequency as well as a phase shift between subunits of about 90$^\circ$ (quadrature phase relationship of subunits). I can confirm these findings for the extension of the simulation to binocular natural images.
All of my subunit RFs are very well described by a 2D-Gabor function ($R^2$ of fit $> 0.9$ for all RFs, $R^2 > 0.97$ for 70\% of all RFs) and thus they are selective for local contrast, spatial frequency, orientation and position of a stimulus. The RFs are depicted in figure \ref{fig:rfs}. There it can be seen that the Gabor wavelets mostly extend over the whole 20x20 pixel RF patches. This is in contrast to simulations with a sparseness objective, in which RFs are more localised and do not extend over a whole patch \citep{goldbach2004}. Nevertheless my RFs still resemble simple cell RFs, which have been reported to be well described by Gabor functions \citep{chalupa2003}.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=320pt]{fig/RFs.pdf}
\caps{Stable subunit RFs obtained with binocular natural visual images. Depicted are all 100 simulated neurons, each with four RFs: subunit one left and right and subunit two left and right (in this order). Values of weights (RFs) are mapped to greyscale: bright means inputs are weighted positively, dark corresponds to negative weights. All RFs can be described by Gabor wavelets ($R^2$ of fit $> 0.9$ for all RFs).}
\label{fig:rfs}
\end{center}
\end{figure}
Cells built on our cell model become translation invariant, an important property of complex cells, if their subunits stand in quadrature phase relationship. Figure \ref{fig:quad} illustrates the distribution of phase differences between subunit RFs, calculated from the corresponding parameter of the Gabor fits. It exhibits a clear peak at 90$^\circ$ and therefore indicates that most of the simulated cells possess this key component of the energy model.
\begin{figure}[ht]
\begin{center}
\includegraphics[width=250pt]{fig/quad_phase.pdf}
\caps{Distribution of absolute phase differences between first and second subunits in degree phase angle (PA). Dark green shows difference for left RFs of all cells and light green shows difference for right RFs. Most cells have subunits, which stand in quadrature phase relationship.}
\label{fig:quad}
\end{center}
\end{figure}
Apart from this phase difference, the basic properties do not differ between subunits, exactly as predicted by the energy model. Accordingly, I find strong correlations between the orientations, spatial frequencies and RF centres of the two subunits, as shown in figure \ref{fig:basicequal}.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/basic_equal.pdf}
\caps{Comparison of basic RF properties between first and second subunits. For orientation and the x- and y-values of the RF centre [$x_0,y_0$], values of the left RFs of all cells are shown. Scatter plots for right RFs are similar, and for orientation approximately identical. Spatial frequency is the mean of left and right RF spatial frequencies, but left and right RF frequencies are almost always equal, too. $x_0$ and $y_0$ are scattered more widely, because it is hard for the fitting procedure to determine these parameters correctly. This is especially true for low frequency RFs, where RF centres are sometimes estimated to lie at the border of the RF patch with a value of 20 (the centre of the RF patch is [10.5,10.5]); RFs with low frequencies ($f<0.06$ c/px, $28/100$ RFs) are depicted as $\scriptscriptstyle ^\bigcirc$. When badly estimated centre values are excluded, I find a significant correlation between subunits for the RF centre as well ($x_0: r=0.86, P<0.001;\; y_0: r=0.92, P<0.001$, low frequencies excluded).}
\label{fig:basicequal}
\end{center}
\end{figure}
But figure \ref{fig:basicequal}A additionally illustrates one clear difference between simulated RFs and those of real cells, which will also hold for the later analysis of complex cell response properties. While the population of striate cortex cells is in general tuned to all kinds of orientations, simulated cells mostly lack tuning to orientations between 45$^\circ$ and 135$^\circ$, except for vertical orientations around 90$^\circ$.
\subsubsection{Binocular properties}
\label{sec:rfbin}
As seen above, subunit RFs have basic properties similar to simple cells. Although the binocular energy model does not assume that subunits represent actual simple cells, it is an interesting further question whether the binocular properties of subunits and those of simple cells are comparable. Therefore I calculated disparities within subunits from RF fits as described in section \ref{sec:defdisp} and contrast them with data reported in \citet{anzai1999}. I exclude two cells (52,74) from the simulation data, because their position disparity exceeds 10 pixels, which indicates that the estimated centre position of the left or right RF, and thus the whole fit, is implausible.
The histograms plotted in figure \ref{fig:dispsimp} suggest that the distributions of disparities in subunits and simple cells have some critical features in common, but they also reveal one main difference between simulation and physiology. Portrayed are counts of simulated and simple cells with certain phase and position disparities. The plots in figure \ref{fig:dispsimp}A, C and E show distributions of first subunit phase disparity in degree phase angle (PA), phase disparity in stimulus space in pixels and position disparity in pixels, respectively, whereas figure \ref{fig:dispsimp}B, D and F show their biological counterparts. I cannot directly compare the ranges of encoded disparities, because the physiological data is given in degree visual angle (VA) and it is unclear how to convert this to pixels as long as biological RF sizes are unknown or differ widely in the reported data. Nevertheless, statements about the form of the disparity distributions can be made. Both disparity distributions in stimulus space of the simulation are Gaussian-like with a mean around zero, as is the case for simple cells. Furthermore, \citet{anzai1999} report that the standard deviation of phase disparity in VA is about 1.6 times larger than the standard deviation of position disparity. A statistical analysis of my disparity distributions likewise shows that phase has a larger variance than position ($F$ test: $F$ ratio = 3.32, df = $(97,97)$, $P<0.001$), with a ratio of phase to position disparity standard deviations of $1.82$. This is in good correspondence with the physiological data. On the other hand, the histograms for phase disparity in PA show that here the distributions in simulation and biology differ. The simulation apparently produces more large phase differences than are found in simple cells. Accordingly, I count 29/98 cells with absolute phase differences larger than 120$^\circ$ PA in the simulation data, whereas in physiology there are only circa 10/97.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/dispSimple.pdf}
\caps{Phase and position disparity distributions for simulated and simple cell RFs. \textbf{A,C,E} Histograms for disparities of first subunit RFs. Phase disparity distribution in PA has a peak at 0$^\circ$, but also peaks at the tails at $\pm 180^\circ$. Phase disparity in pixel does not show this phenomenon and has a standard deviation of 3.83 px. Position disparities are centred at 0 px and have a smaller standard deviation of 3.74 px for relative and 2.1 px for true position disparities. \textbf{B,D,F} Histograms for disparity distributions of simple cells adapted from \citet{anzai1999}. All distributions are centred around 0. More than 80\% of all phase differences in PA lie within $\pm 90^\circ$. Standard deviation of phase disparities in VA is 0.59$^\circ$. Position disparities are relative position disparities determined with a reference cell method. Their standard deviation is 0.52$^\circ$. The estimate for standard deviation of true position disparities is 0.37$^\circ$ (given that position disparities of cell and reference cell are independent).}
\label{fig:dispsimp}
\end{center}
\end{figure}
Additional comparisons between subunits and simple cells underline the relationships found so far. Like \citet{anzai1999}, I do not find a correlation between position and phase disparities in pixels ($r=0.11, R^2<0.01\%$). Furthermore, an assessment of disparities in relation to RF orientation reveals similar properties, too. In physiology it has been described that RFs with horizontal orientations ($0^\circ \pm 20^\circ$) encode a smaller range of phase disparities than RFs with vertical orientations ($90^\circ \pm 20^\circ$) (\citet{anzai1999}: $F$ ratio = 2.94, df = $(18,25)$, $P<0.01$). This is analogous to the simulation data ($F$ ratio = 2.15, df = $(30,34)$, $P<0.05$), although here the variance seems to increase faster from horizontal to vertical orientations, which is also expressed in the greater $P$-value.
\\
\\
The data presented here imply that most optimally stable cells built on the used cell model are a good realisation of the binocular energy model. In addition, I find major similarities between subunit RFs and those of simple cells, which might be particularly interesting given that \citet{hubel1962} already proposed that complex cell responses could be built on simple cells as their input. Nevertheless, not all properties of simple cells are reflected in the population of subunit RFs; the differences in the distributions of orientations and of phase disparities in degree PA are my main findings in this respect.
%\subsection{Complex cell properties from binocular interaction profiles}
\subsection{Comparison to complex cell data from Anzai et al.}
\label{sec:anzai}
Subunit RFs of real neurons cannot be determined directly, but have to be estimated from the top-level output of the complex cells. Here I imitate the method of \citet{anzai1999b} for estimating binocular subunit RF properties from responses of cat complex cells and compare my results with theirs. I note that this method does not recover the interaction profiles of the original subunits, but rather computes functional subunits with similar properties. Nonetheless, some of the findings from the direct RF analysis are represented in the binocular interaction profiles as well. Correspondingly, I again find several similarities between optimally stable cells and real neurons; as major differences I report overrepresented tails in the distribution of phase encoding in the simulation data and a dependence of disparity magnitudes on the preferred orientation of simulated cells.
\subsubsection[SVD components as subunits]{SVD components of binocular interaction profiles as subunits}
Binocular interaction profiles represent the different responses of a cell to stimuli shown at different positions in the two eyes (see methods \ref{sec:iprf}). If these profiles exhibit elongated regions of uniform activation along the frontoparallel axis, where the position differences of left and right stimuli are equal (axis of equal disparity), then the corresponding cell is disparity selective. Except for two cells (25 and 76, both with very low spatial frequency) all of the simulated cells show clear elongated regions along the frontoparallel axis (an example is depicted in figure \ref{fig:iprf}), similar to the description of disparity selective complex cells in \citet{anzai1999b}.
By calculating quadrature (mutually uncorrelated) components of the binocular interaction RFs with singular value decomposition, \citet{anzai1999b} expect to obtain interaction profiles of underlying functional complex cell subunits. They report that in more than 50\% of the examined cells the SVD gives evidence for the existence of only two components that contribute significantly to the binocular interaction profile, confirming a key prediction of the energy model. Nevertheless, they also find neurons in which the interaction profile can be decomposed into three or four major components. In contrast, the SVD performed on simulated interaction profiles does not identify any cells with more than two key components, as expected for a two-subunit model. Nonetheless, the population averages of the data variances to which the different components contribute (component weights) are quite similar in both data sets (see figure \ref{fig:compvar}). Correspondingly, the percentage of variance contributed by the first two components is circa 80\% in physiology, while it is about 87\% in my simulation.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/compVar.pdf}
\caps{Contribution of components 1 to 16 to the data variance. Each data point represents the population average and error bars visualise $\pm$SD. \textbf{A} Only the first two components contribute significantly to the variance in the simulation data; taken together they account for circa 87\% of the variance. ($n=100$) \textbf{B} In real neurons sometimes three or four components need to be considered to explain the data variance. Nevertheless the mean values are similar to those in A and the first two components already account for circa 80\% of the variance. ($n=48$)}
\label{fig:compvar}
\end{center}
\end{figure}
There are a few simulated cells for which the first SVD component alone accounts for more than 70\% of the data variance, which indicates that these components are not a correct replication of the underlying subunits, because the subunit RFs usually have well balanced amplitudes, as can be seen in figure \ref{fig:rfs}. Rather, the components should be seen as functional subunits, which are merely linear combinations of subunits. In fact, if I compute the binocular interaction profiles of the single subunits and compare them with the SVD components, it turns out that they seldom match. Instead the components represent the mean of the subunit interaction profiles, or rotated or shifted versions of them. An example of this behaviour can be seen in figure \ref{fig:svdsub}. Consequently, the following comparison is not one of true subunit properties, but it is still of interest, because it reveals functional properties of the cells.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=320pt]{fig/svdsub.pdf}
\caps{Binocular interaction profiles of the subunits compared to those of the first and second SVD components. If there is a dominant first SVD component, as here (cell 70), it resembles the mean of the subunit interaction profiles more than a single subunit. Otherwise the first component has been observed to represent the first subunit, the second subunit, or rotated or shifted versions of them.}
\label{fig:svdsub}
\end{center}
\end{figure}
\subsubsection{Disparity encoding in component interaction profiles}
The SVD of binocular interaction RFs conveniently yields monocular profiles of the component interaction profiles. I use these monocular profiles to determine the component disparities by fitting 1D-Gabor functions to the left and right profiles and evaluating their parameters. All Gabor fits account for more than 96\% of the variance of the curves, but I again exclude two cells (27, 95) from the analysis, because their position disparities exceed 10 px, implying bad fits. Thus, I here compare a simulation data set of 98 cells to the data set of 48 cells from \citet{anzai1999b}.
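The decomposition step can be sketched as follows. This is a minimal numerical example, not my actual analysis pipeline: the interaction profile is built from two hypothetical energy-model subunits with invented parameters, and the SVD then yields monocular left and right profiles together with the fraction of variance carried by each component.

```python
import numpy as np

# Hypothetical binocular interaction profile built from two
# quadrature subunits (energy-model style); all parameters invented.
x = np.linspace(-10, 10, 41)           # RF positions in px
sigma, f = 3.0, 0.1                    # Gabor width (px) and frequency (cyc/px)

def gabor(x, phase):
    """1D Gabor envelope used as a monocular subunit profile."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * f * x + phase)

# interaction profile = sum over subunits of the outer products of
# left and right monocular profiles (rank 2 for two subunits)
B = (np.outer(gabor(x, 0.0), gabor(x, 0.5))
     + np.outer(gabor(x, np.pi / 2), gabor(x, 0.5 + np.pi / 2)))

# SVD: columns of U / rows of Vt are left / right monocular profiles
U, s, Vt = np.linalg.svd(B)
var_explained = s**2 / np.sum(s**2)

left1, right1 = U[:, 0], Vt[0]         # monocular profiles of 1st component
```

For this rank-2 toy profile the first two components capture essentially all of the variance, mirroring figure \ref{fig:compvar}A.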
The binocular energy model predicts that, apart from a phase difference, the subunits share their RF parameters such as preferred spatial frequency, orientation and disparity. \citeauthor{anzai1999b} tested for this relation between the first and second SVD components and found good correspondence with the prediction. However, while most of the parameters of the first and second components roughly agree in the simulation, there is no correlation between their Gabor position parameters, as illustrated in figure \ref{fig:poscomps}. Since there is a correlation when RFs are considered directly, this must be a by-product of the intermediate processing steps. One problem may be that inaccuracies accumulate on the way to the monocular profiles; because the position parameters are quite small, they are particularly sensitive to such errors.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=300pt]{fig/posComps.pdf}
\caps{Comparison of Gabor position parameters ($x_0$) in first and second SVD components. \textbf{A} There is no correlation between $x_0$ in components of simulation. Black circles represent right profiles and light-coloured circles left. Data points lying outside the shown region are plotted on the nearest border. \textbf{B} \citet{anzai1999b} find good agreement between the first two SVD components.}
\label{fig:poscomps}
\end{center}
\end{figure}
If it is true that the estimated Gabor position parameters of the monocular profiles are very noisy, then one would expect the estimated distribution of position parameters to be wider than the real one. \citet{anzai1999b} report that position disparity of the first SVD component is limited to a relatively small range compared to phase disparity. While they find that the standard deviation of phase disparities in pixel is about three times larger than the standard deviation of relative position disparities (SD ratio: 3.05), in the simulation data the distribution of phase disparities is only marginally wider than the distribution of relative position disparities (SD ratio: 1.22), although I observe more large phases in PA than in physiology (histograms depicted in figure \ref{fig:dispcomp}). Accordingly, the expected effect of noisy position estimates might have introduced this discrepancy. The only marked divergence between the disparity distributions of simulated and real cells would then lie in the distribution of phase in PA, suggesting that the simulation contains more cells whose monocular interaction profiles have different shapes.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/dispComp.pdf}
\caps{Phase and position disparity distributions of the first SVD component for simulated and simple cells. (See figure \ref{fig:dispsimp} for the same comparison at the subunit RF level.) \textbf{A,B} The distribution of phase disparity in PA is much broader in the simulation. Especially phases near $\pm180^\circ$ are more frequent. \textbf{C,D,E,F} In physiology, position disparity is distributed considerably more narrowly than phase disparity (SD ratio: 3.05, relative position; 4.27, estimated true position). In simulated cells this is similar, but not as pronounced (SD ratio: 1.22, relative; 2.19, true position), which might be due to inaccuracies in the calculation.}
\label{fig:dispcomp}
\end{center}
\end{figure}
For simple cells it has been reported that neurons with horizontal orientations predominantly encode small disparities, which is thought to reflect that natural stimuli contain greater horizontal than vertical disparities because of the lateral displacement of the two eyes. However, this dependence of disparity on preferred stimulus orientation does not seem to exist for SVD components of complex cell interaction profiles. In figure \ref{fig:disporif} magnitudes of relative position and phase disparities are plotted against RF orientation. In both the simulation and the data from \citet{anzai1999b}, cells with near-horizontal orientations also encode a large range of phase disparities in PA. Nevertheless, simulated cells with horizontal orientations show a slight tendency towards a smaller range of disparities. This tendency is clearer for disparities in px (figure \ref{fig:disporif}C), where it is even significant on an $F$ test between the variance of phase disparities of cells with orientations within $30^\circ$ of horizontal and the variance of phase disparities of cells with orientations within $30^\circ$ of vertical ($F$ ratio: 1.93, df=(30,52), $P< 0.05$). Therefore, the SVD components in the simulation reflect the postulated property of natural stimuli that large vertical disparities are less frequent than horizontal ones, whereas this is apparently not represented in the components of complex cells.
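The variance comparison can be reproduced schematically as follows. The phase-disparity samples are drawn from invented distributions whose spreads merely mimic the trend described above, so the resulting $F$ ratio and $P$ value are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative phase disparities in deg PA; the narrower spread for
# near-horizontal cells mimics the trend described in the text.
phase_horiz = rng.normal(0, 30, size=31)   # within 30 deg of horizontal
phase_vert = rng.normal(0, 80, size=53)    # within 30 deg of vertical

# F test on the two sample variances (larger variance in the numerator)
F = np.var(phase_vert, ddof=1) / np.var(phase_horiz, ddof=1)
df = (len(phase_vert) - 1, len(phase_horiz) - 1)
p = stats.f.sf(F, *df)                     # one-sided P(F >= observed)
```

With these invented spreads the test comes out clearly significant; in the actual data the effect is weaker ($F$ ratio 1.93) but still below the 5\% level.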
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/dispOriF.pdf}
\caps{Relation of first SVD component disparities to RF orientation and spatial frequency. \textbf{A,B} In simulated and real cells the scatter of phase disparities in PA is large for all RF orientations. \textbf{C,D} While magnitudes of phase disparity in pixel are more or less evenly distributed in data from real neurons with horizontal or vertical orientation, the variance of phase disparities is smaller for simulated cells with horizontal orientation ($0\pm 30^\circ$) than for cells with vertical orientation ($90\pm 30^\circ$) ($F$ ratio: 1.93, df=(30,52), $P< 0.05$) \textbf{E,F} A size-disparity correlation is found in both data sets. Solid line represents upper limit for phase disparities (180$^\circ$ PA), dashed line corresponds to 90$^\circ$ phase in PA. Biological data from \citet{anzai1999b}. Position disparities are relative position disparities.}
\label{fig:disporif}
\end{center}
\end{figure}
Figure \ref{fig:disporif}E,F illustrates the relationship between the disparity of the SVD components and RF spatial frequency. It has been proposed that a size-disparity correlation, in which the range of encoded disparities is limited by the periodicity of the RFs, is advantageous for determining the correct depth of a stimulus \citep{marr1979}. Such a correlation is expected when phase is the dominant encoding principle, because the maximum phase disparity in pixel ($\pm 180^\circ$ PA) directly depends on RF spatial frequency. As seen in the figure, phase disparities in pixel are widely scattered below the maximum phase limit in simulation and physiology, though the scatter in physiology is not as evenly distributed as in the simulation. Nonetheless, a clear dependence of phase disparity in pixel on RF spatial frequency can be observed in both data sets (regression analysis in data from \citeauthor{anzai1999b}: slope$=-1.99$, $P<0.01$), while there is no such dependence between phase disparity in PA and RF spatial frequency (not shown). Furthermore, \citet{anzai1999b} find that position disparities are generally small and their range does not depend on RF spatial frequency (regression slope $=-0.3$, $P=0.07$). In simulated cells, on the other hand, position disparities reach quite high values and seem to depend on frequency as well, but it is unclear whether this reflects real properties of the cells or is merely a product of the above-mentioned noise.
\\
\\
All in all, the analysis of binocular interaction profiles reveals many properties shared by optimally stable and binocular complex cells. Both exhibit elongated regions along the frontoparallel axis within the interaction profiles, suggesting sensitivity to disparity independent of stimulus position in the RF. Although an SVD of binocular interaction profiles does not recover the original subunit interaction profiles, the resulting components can be used to describe functional properties of the cells. The number of components found in physiology largely supports the use of a two-subunit energy model. Furthermore, the distributions of the different disparities in simulation and physiology show similar tendencies, which is also true for their relation to the RF parameters spatial frequency and orientation. However, the analysis again shows that the simulation contains many more cells with high-magnitude phase disparities (PA) than found in biology. Additionally, simulated cells seem to be specialised for horizontal disparities, while this is not the case for real complex cells.
%\citet{anzai1999} show that the distribution of relative position disparities is by a factor of $\sqrt{2}$ greater than the distribution of absolute position disparities under the assumption that position disparities of cell and reference cell are independent. Although position disparities of components 1 and 2 are not correlated (r: -0.08, P: 0.46, corrcoef), relative position disparities have more than $\sqrt{2}$ greater variance than absolute position disparities.
%uncorrelated position disparities:
% hypothesis: in comparison to phase, position disparities are negligible small
% test: find cells with approximately equal phase and position disparities, in these cases position disparity has to be relevant, too, and so position disparity should be equal in both components, if cell is tuned to disparity
%Correspondingly position and phase encoding are not correlated while phase encodes a greater range of disparities, which is in good correspondence with the data from \citeauthor{anzai1999b}. On the other hand, the distribution of phase as estimated with interaction profiles tends to have larger tails than reported in physiology, too.
\subsection{Comparison to RDS data from Prince et al.}
\label{sec:prince}
\citeauthor{prince2002} conducted an extensive physiological study of disparity tuning in monkey striate cortex in which they recorded responses to RDS from 787 cells. They quantified the sensitivity for horizontal disparity with the DDI \citep{prince2002a} and further examined 180 strongly disparity-tuned cells by comparing fitted Gabor function parameters \citep{prince2002}. Unfortunately they did not discriminate between simple and complex cells in most of their analyses, but of the 226 cells for which they determined the F1:F0 ratio they classified 57 as simple (25\%), suggesting that a large majority of the reported results belong to complex cells. Furthermore, they found no correlation between F1:F0 ratio and DDI, which indicates that simple and complex cells have similar disparity sensitivities; this is supported by data from \citet{anzai1999,anzai1999b}, where the distributions of phase and position parameters of simple and complex cells are also similar.
I computed responses of my optimally stable cells to RDS with horizontal disparity as described in methods (\ref{sec:rds}) and analysed them analogously to \citeauthor{prince2002}. All simulated cells exhibit statistically significant modulations of their disparity tuning curves. Although the response types are largely comparable, my investigations reveal that simulated tuning curves are often wider and that the variances of the data points differ from those of real neurons. Moreover, while in physiology phase and position encode similar ranges of disparity (phase slightly larger), the disparity of simulated cells is predominantly determined by the phase component.
\subsubsection{Activities to RDS and disparity sensitivity}
\label{sec:princeact}
\citet{prince2002a} provide data in which they quantify the activities of cells in V1 to RDS with horizontal disparity in order to make statements about the cells' strength of disparity tuning. They report that the firing variance of their cells increases with mean firing rate. This relation has long been known for cortical neurons \citep[see][for discussion]{shadlen1998} and does not depend on the stimuli used (as long as they lead to the same mean firing). The variance of artificial activities, on the other hand, is the direct result of variability in the given stimuli, because no random process is involved in the calculation of the activity to a certain stimulus. Thus the variance-stabilising transform (square root of activities) employed by \citeauthor{prince2002a} deals with a different kind of variance than the one obtained from my simulated cells. Then again, \citeauthor{prince2002a} apparently define all RDS with the same disparity as one stimulus, which gives me the opportunity to compare their variances of the mean firing rate to such a stimulus with my variances of the activities to different RDS with equal disparity. Still, this comparison should be judged with care, since the variances may result from different sources of variability.
Unfortunately \citeauthor{prince2002a} do not give direct figures of unstabilised mean firing rates, but they report that before their square-root normalisation the Fisher-transformed correlation values of mean firing rates and variances are $0.86 \pm 0.72$ (SD) on average over cells. The corresponding value for the Fisher-transformed correlation of artificial activities to RDS with a certain disparity and their variances is $2.32 \pm 0.19$, indicating that this correlation is far stronger than the one found in biology. Figure \ref{fig:var} illustrates this dependency of variability on strength of response very clearly. There, mean activities to equal-disparity RDS are plotted against their variances and standard deviations. It can be seen that the average of the activities has an approximately quadratic relation to the variance and accordingly a linear one to the standard deviation. Consistently, I find that the ratio of standard deviation to mean activity is almost constant around a value of 0.53 ($\pm 0.018$). \citeauthor{prince2002a} reduce the reported correlation to a value of $0.21 \pm 0.68$ (Fisher transformed) by the square-root transformation of their mean firing rates. The same operation on my activities reduces the correlation only marginally ($2.02 \pm 0.27$) and is therefore useless from that perspective. Another effect of this procedure described by \citeauthor{prince2002a} was that it removed a positive skew from the response distributions for single stimuli. I made similar observations. However, since this is consistent over all cells and stimuli and the ratio of standard deviation to mean is still approximately constant ($0.28 \pm 0.009$), I do not apply the square-root transform to my data.
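Why the square-root transform fails here can be illustrated with a toy example. The activity samples below are Gaussian with a standard deviation of half the mean (the ratio observed for my cells); all numbers are invented, but the sketch shows that the Fisher-transformed correlation of means and variances stays high after the transform when the SD, rather than the variance, is proportional to the mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "responses": one mean per disparity, sd = 0.5 * mean,
# mimicking the constant sd/mean ratio of the simulated cells.
means = np.linspace(10, 100, 41)
samples = np.stack([rng.normal(m, 0.5 * m, 1000) for m in means])

def fisher_z(x, y):
    """Fisher z-transform of the correlation between x and y."""
    return np.arctanh(np.corrcoef(x, y)[0, 1])

z_raw = fisher_z(samples.mean(axis=1), samples.var(axis=1, ddof=1))

# square-root "variance-stabilising" transform; abs() guards against
# the few negative draws of the toy Gaussian
sq = np.sqrt(np.abs(samples))
z_sqrt = fisher_z(sq.mean(axis=1), sq.var(axis=1, ddof=1))
```

Both Fisher-transformed correlations come out large: the square-root transform stabilises variance for Poisson-like (variance $\propto$ mean) data, not for SD $\propto$ mean data.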
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/var.pdf}
\caps{Variability of artificial activities. \textbf{A} Each data point represents the mean activity of one cell to 1000 different RDS with a certain disparity plotted against its variance (41 different disparities $\cdot$ 100 cells = 4100 data points). The line describes the function $y=0.53^2x^2$, where 0.53 is the ratio of standard deviation to mean. \textbf{B} Same as A but with standard deviation. Mean and standard deviation have a linear relationship and their ratio is approximately constant at $0.53 \pm 0.018$. Plots with activities which have been scaled by a square-root transform are very similar, although the ratio of standard deviation to mean is reduced ($0.28 \pm 0.009$).}
\label{fig:var}
\end{center}
\end{figure}
I discussed these matters of activity variability first, because the reported variances strongly influence the DDI, which takes them into account in order to quantify the discriminability of the maximum and minimum points of the disparity tuning profile. The idea is to rate tuning curves with lower variances better than tuning curves with the same strength of modulation but higher variance. As the variability of simulated cell activities is considerably greater than that found in physiology, lower DDI values are expected. This can be seen in figure \ref{fig:ddi}. The DDI values of simulated cells range from 0.03 to 0.16, but still offer a good description of disparity discriminability. For example, all very-low-frequency tuning curves have DDI$<0.06$ independent of their mean activity, while tuning curves with higher frequency, i.e.\ with more pronounced tuning, also receive higher DDI values. Additionally, \citeauthor{prince2002a} tested their cells for disparity selectivity with a one-way ANOVA on their square-root firing-rate data and report that 55\% of their V1 neurons show significant modulations at the 5\% level. Although the assumptions of normally distributed data with equal variance are violated by the artificial activities\footnote{The ANOVA test is known to be robust to modest violations of these assumptions.}, all 100 of my cells show highly significant ($P \ll 0.01$) modulations with disparity on a one-way ANOVA.
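The DDI computation and the ANOVA test can be sketched as follows. The tuning curve and the noise model are invented for illustration, and the DDI formula with its RMS-error penalty follows my reading of \citet{prince2002a}; for the exact definition used in this thesis, eq.\ \ref{eq:ddi} is authoritative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

disparities = np.arange(-20, 21)   # 41 horizontal disparities in px
n_trials = 50                      # hypothetical repeats per disparity

# Toy Gabor-shaped tuning curve with mean-proportional noise,
# mimicking the sd/mean ~ 0.5 variability described above.
tuning = 40 + 30 * np.exp(-disparities**2 / 50) * np.cos(0.4 * disparities)
responses = rng.normal(tuning, 0.5 * tuning, size=(n_trials, 41))

means = responses.mean(axis=0)
# RMS of the residuals around the per-disparity means
rms = np.sqrt(((responses - means) ** 2).mean())

# disparity discrimination index: modulation depth penalised by
# response variability (form assumed after Prince et al., 2002a)
ddi = (means.max() - means.min()) / (means.max() - means.min() + 2 * rms)

# one-way ANOVA across the 41 disparity levels
F, p = stats.f_oneway(*responses.T)
```

High response variability pushes the denominator up and the DDI down, which is exactly the effect discussed for the simulated cells, while the ANOVA can still detect the modulation.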
% Furthermore, only 32\% (253/787) of real neurons had an $F_{index}$ of > 0.8 whereas the $F_{index}$ of all my cells is > 0.95.
This suggests that even though the DDI values are low because of the high variance, all simulated cells significantly change their response with disparity, which is backed by the fact that all of my cells are binocular according to the ODI ($\mu = 0.49 \pm 0.03$ SD). Consequently, the DDI values form a continuum from low to high disparity sensitivity, similar to biology.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/DDI.pdf}
\caps{Histograms of DDI values of simulated cells and cells in striate cortex. \textbf{A} DDI values for 100 simulated cells. The maximum is 0.151 and the minimum is 0.035. A one-way ANOVA revealed that all cells exhibit significant modulations with disparity ($P\ll 0.01$). \textbf{B} DDI of 787 V1 neurons (data from \citet{prince2002a}). Dark shading marks neurons with significant disparity tuning on a one-way ANOVA ($P<0.05$). Neither in A nor in B can two distinct populations be seen.}
\label{fig:ddi}
\end{center}
\end{figure}
The DDI has been used to discover relationships between the strength of disparity tuning and other cell properties. First of all, \citeauthor{prince2002a} found that the DDI is independent of mean firing rate. In contrast, since the variances of the simulated activities depend heavily on their means, and the DDI in turn depends on these variances, it is no surprise that the DDI decreases with increasing mean activity in my data; this dependency is, however, more moderate than anticipated (Spearman's rank: $r_s=-0.33$, $P<0.001$), which can be traced back to a simultaneous increase of $a_{max}-a_{min}$ in the definition of the DDI, eq.\ \ref{eq:ddi} ($r_s=0.71$, $P\ll 0.001$). Scatter plots of mean activity versus DDI are depicted in figure \ref{fig:ddicomp}A and B. Further plots in this figure show the DDI plotted against RF spatial frequency and RF orientation. Whereas frequency is not significantly correlated with the DDI ($r_s=-0.09$, NS), as in physiology, I find a clear correlation between DDI and RF orientation ($r_s=0.69$, $P \ll 0.001$), contrary to physiology. This relation is to some extent predicted by the binocular energy model, which states that the tuning curves of cells with preferred horizontal orientations to stimuli with horizontal disparity will have a low frequency, so that low DDI values will be assigned to them (see above). Correspondingly, the biological finding that orientation is not correlated with the DDI could be explained in two ways: either Prince's DDI does not rate all low-frequency tuning curves low, or the prediction of the energy model fails. As one can see in figure \ref{fig:cycsd}, this prediction is fulfilled to some degree by the measured V1 neurons; hence the former must be true, indicating that my DDI ratings and Prince's deviate when certain tuning types are judged.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/DDIcomp.pdf}
\caps{Relationship between DDI and other cell properties in simulation (left) and physiology (right, data from \citet{prince2002a}). Correlation has been tested with Spearman's rank in all cases. \textbf{A,B} DDI versus mean activity. A slight correlation can be found in the simulation data ($r_s=-0.33$, $P<0.001$), but not in physiology. This correlation might be an overestimation, since all points with mean activity above 5000 and DDI$<0.05$ represent low-frequency tuning curves, which consistently received low DDI values. \textbf{C,D} The DDI is significantly correlated with the frequency of the preferred grating stimulus neither in the simulation ($r_s=-0.09$, NS) nor in real neurons ($r_s=0.07$, NS). \textbf{E,F} Although the preferred orientation of a cell and the DDI show no significant correlation in biology ($r_s=0.06$, NS), they do in the simulation data ($r_s=0.69$, $P \ll 0.001$), as predicted by the energy model.}
\label{fig:ddicomp}
\end{center}
\end{figure}
\subsubsection[Tuning types described by position and phase]{Tuning types described by position and phase encoding}
Disparity tuning curves are often classified into six different categories according to their shape \citep{poggio1988}. Here I examine whether the tuning curves of simulated neurons exhibit the reported shapes. To this end, I fitted the tuning curves with Gabor functions and compare their parameters with those published by \citet{prince2002}. Furthermore, I contrast the estimates for position and phase encoding of disparity from this study with the disparity encodings obtained from my cells.
All of my tuning curves are well described by 1D-Gabor functions and have an $R^2$ value $>0.93$, but I exclude four cells, because the confidence intervals of their fits are complex, which indicates that these fits cannot be trusted due to especially large confidence intervals\footnote{This criterion is somewhat arbitrary, since there are other cells with large but non-complex confidence intervals; however, I did not want to exclude too many cells, since \citet{prince2002} did not do so either.}. This leaves a population of 96 cells for comparison with \citeauthor{prince2002}'s population of 180 real neurons.
The characteristics of the six tuning curve types are as follows: tuned zero (T0) cells have a tuning curve with a narrow peak at zero disparity; tuned excitatory cells (TE; tuned near, TN, and tuned far, TF) resemble T0 cells, but their peak is shifted to near or far disparities; tuned inhibitory (TI) cells exhibit a trough around zero disparity; and near (NE) and far (FA) cells show broad tuning for near and far disparities, respectively, while being inhibited at the opposite disparities. The phase and position parameters of the fitted Gabor should therefore suffice to classify the tuning curves according to this scheme. Tuning curves with phase near zero then correspond to T0 or TE types, depending on position, while curves with phase around $\pm \pi$ are TI, and asymmetrical curves with phase near $\pm \pi/2$ are classified as NE and FA, respectively. A scatter plot of these two parameters can be seen in figure \ref{fig:types}, along with examples of tuning curves of all types from simulation and physiology.
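This classification scheme can be expressed as a small decision rule. The position threshold separating T0 from TE and the sign convention mapping odd-symmetric phases to NE versus FA are illustrative assumptions of the sketch, not values taken from \citet{prince2002}; only the TI criterion ($|phase|>\frac{3}{4}\pi$) is theirs.

```python
import numpy as np

def classify_tuning(phase, position, pos_thresh=1.0):
    """Assign a Poggio-style tuning type from fitted Gabor parameters.

    phase in rad, position in px. pos_thresh (T0 vs. TE split) and the
    near/far sign convention are illustrative assumptions.
    """
    phase = np.angle(np.exp(1j * phase))      # wrap to (-pi, pi]
    if abs(phase) > 3 * np.pi / 4:            # TI criterion of Prince et al.
        return 'TI'
    if abs(phase) > np.pi / 4:                # odd-symmetric curves
        return 'NE' if phase < 0 else 'FA'    # sign convention assumed
    # symmetric peak: tuned zero vs. tuned excitatory by peak position
    return 'T0' if abs(position) < pos_thresh else 'TE'
```

A curve with near-zero phase and negligible position offset then comes out as T0, while the same phase with a clearly shifted peak comes out as TE.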
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/typesDots.pdf}
\caps{Tuning types in simulation (\textbf{A}) and physiology (\textbf{B}, adapted from \citet{prince2002}). See text for description.}
\label{fig:types}
\end{center}
\end{figure}
\citet{prince2002} found all tuning curve types in their data, but there were no clusters of certain types; rather, they report a continuum of tuning curve forms from one type to another. The same holds when the disparity response profiles of the simulated cells are considered. However, it is hard to find good examples of TE tuning curves. Although there are cells that are sharply tuned (similar to cell 22, T0) to near and far disparities, their tuning curves are asymmetric rather than symmetrical around their peak. Furthermore, the peak disparity of TE cells is always near zero in my case\footnote{Although points with phase near zero and larger position parameter can be seen in the scatter plot of figure \ref{fig:types}A, I would not classify them as TE, because these are very broad Gaussian-like tuning curves.}. This is comprehensible given that the disparity of TE cells is encoded by position disparity alone, because the tuning curves of TE cells are symmetrical (phase = 0). Since I have shown in section \ref{sec:rfbin} that the RFs of simulated cells exhibit only small position disparities, it follows that TE cells can only be tuned to disparities near zero.
Such a conclusion is valid because the finding from the RFs transfers directly to the parameters of the tuning curves, as can be seen in figure \ref{fig:posphase}, where position is plotted against phase transformed into pixel space. The figure clearly shows that the distribution of phase shifts in pixel is much broader than that of position shifts. While the position parameters have a mean of 0.22 px with standard deviation 1.25 px, the corresponding values for phase are much larger: $0.89 \pm 6.1$ px. Compared to the RF analysis in section \ref{sec:rfbin}, this discrepancy has even increased with RDS: there the ratio of the standard deviations was 1.82, while here it is 4.89. This is most probably a result of the use of horizontal disparities, because compared to disparities orthogonal to the RF orientation the tuning curves increase in spatial scale, and since position disparities seem to have little influence on the encoding of disparity in my cells, they lose even more importance. The most important point of figure \ref{fig:posphase}, however, is that it exposes an obvious difference between simulation and physiology: although \citeauthor{prince2002} report that in their data the range of disparities encoded by phase is slightly larger than that encoded by position (SD ratio: 1.25), this effect is by far not as strong as in my data.
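The transformation of phase into pixel space and the resulting ratio of standard deviations can be sketched as follows. The parameter distributions are invented to roughly mimic the numbers quoted above, and I assume phase in radians, hence the factor $2\pi$ alongside the sign flip.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative fitted tuning-curve parameters for a 96-cell population;
# the distributions loosely mimic the values quoted in the text.
freq = rng.uniform(0.05, 0.2, size=96)       # tuning-curve frequency, cyc/px
phase = rng.normal(0.0, 1.5, size=96)        # fitted Gabor phase, rad
position = rng.normal(0.22, 1.25, size=96)   # fitted Gabor position, px

# phase expressed as an equivalent position shift in px
# (phase in rad divided by 2*pi*f; sign flip as in the figure caption)
phase_px = -phase / (2 * np.pi * freq)

# ratio of the standard deviations of the two encodings
sd_ratio = np.std(phase_px, ddof=1) / np.std(position, ddof=1)
```

An SD ratio well above one, as produced here, corresponds to phase dominating the encoded range of disparities.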
\begin{figure}[ht]
\begin{center}
\includegraphics[width=\textwidth]{fig/posPhase.pdf}
\caps{Comparison of phase and position parameters in simulation and physiology. Phase has been transformed into position space by division by $-1 \cdot$ the spatial frequency of the tuning curve. \textbf{A} The Gabor mean position encodes a much smaller range of disparities than the Gabor phase in simulated cells (SD ratio: 4.89). Position and phase are negatively correlated ($r_s=-0.52$, $P \ll 0.01$), i.e.\ their contributions partially cancel each other, but as phase is generally considerably greater than position, a large range of disparities can still be encoded. \textbf{B} Position and phase parameters encode similar ranges of disparity in the data from \citet{prince2002} (SD ratio: 1.25). Only a slight positive correlation is observed ($r_s=0.24$, $P < 0.001$).}
\label{fig:posphase}
\end{center}
\end{figure}
This discrepancy between the ratios of standard deviations in simulation and physiology is reinforced by a further difference, depicted in figure \ref{fig:phase}. In the transformation from phase in PA to pixel, a large phase in PA produces a large value in pixel. However, for a large phase in PA there may be no straightforward interpretation of the corresponding large pixel value in terms of position shifts and preferred disparity. This problem obviously mainly concerns TI tuning curves, which have the largest phases (near $\pm \pi$) but are only inverted versions of T0 curves; their shift in phase cannot be equated with a position shift of equivalent magnitude. Thus, many large phases lead to a disproportionate representation of phase in the ratio of standard deviations where preferred disparities are concerned. This is the case when simulated cells and real neurons are compared, since the simulation data contain more tuning curves with large phase than reported in biology (see figure \ref{fig:phase}). Correspondingly, this is reflected in the percentage of cells classified as TI by a criterion proposed in \citet{prince2002}\footnote{TI: $|phase|>\frac{3}{4}\pi$}, where only 16\% of all cells were labelled TI; in contrast, I classify 26\% of the simulated cells as TI according to the same criterion. Nevertheless, I here interpret the ratio of standard deviations as a measure of the different contributions of phase and position to the characteristics of the tuning curves, and it still shows that the phase parameter has a much greater influence on the tuning curves than position in my data.
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/phase.pdf}
\caps{Distribution of fitted phases as polar plots. \textbf{A} There is a substantial number of simulated tuning curves with phases near $\pm \pi$. (Distance from the origin represents the number of tuning curves with that phase.) \textbf{B} In physiology, on the other hand, large values of phase are rare and phases near zero dominate. Data replotted from \citet{prince2002} (solid line), including data from \citet{nieder2000} (dotted line) and from \citet{anzai1999,anzai1999b} and \citet{deangelis1991} (dashed line).}
\label{fig:phase}
\end{center}
\end{figure}
A final observation from figure \ref{fig:types} concerns the T0 cell shown there (c22). The tuning curve of this cell exhibits several peaks, which is seldom reported for real neurons. This phenomenon can be quantified as the ratio of the standard deviation of the fitted Gabor ($W$) to the wavelength of the fitted tuning curve ($\lambda = 1/f$), which gives an estimate of the number of cycles within one standard deviation. Importantly, for this analysis I capped values of $W$ at 10.25 px, i.e.\ I set $W=10.25$ px for all tuning curves with $W>10.25$ px. The rationale is the following: Gabor functions with $W$ much greater than $10.25$ px allow modulation of the disparity tuning curve beyond $\pm 20$ px, which is implausible, because all disparities smaller than $-20$ px and greater than 20 px represent uncorrelated stimuli in the left and right RFs, so the tuning profiles should be flat in these regions. Hence, if $W>10.25$ px, this estimate has to be wrong, and $W=10.25$ px is the next plausible estimate. This non-linear transformation of the standard-deviation parameter does not change the subsequent results.
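The capped ratio can be written as a one-line helper; the function name and interface are of course arbitrary.

```python
import numpy as np

W_MAX = 10.25   # px; cap on the fitted Gabor SD used in the text

def cycles_per_sd(W, f, w_max=W_MAX):
    """Cycles per standard deviation of a fitted Gabor, W / lambda.

    W: fitted Gabor SD in px; f: tuning-curve frequency in cyc/px
    (so lambda = 1/f). W is capped at w_max, since modulation beyond
    +/- 20 px disparity is implausible (stimuli are uncorrelated there).
    """
    return np.minimum(W, w_max) * f
```

For example, a fit with $W=5$ px and $f=0.1$ cyc/px gives 0.5 cycles per SD, while any $W$ above the cap is treated as $W=10.25$ px.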
\begin{figure}[htp]
\begin{center}
\includegraphics[width=\textwidth]{fig/cycSd.pdf}
\caps{Number of cycles per standard deviation of the fitted Gabor ($W/\lambda$) as a function of RF orientation. \textbf{A} The minimum of $W/\lambda$ in simulated tuning curves is 0.14 and the maximum is 0.94. Cell c22 from figure \ref{fig:types} is shown in red. \textbf{B} In the data from \citet{prince2002a} tuning curves cover a smaller range of $W/\lambda$ and most stay below 0.5. Yet both data sets exhibit the correlation of $W/\lambda$ with RF orientation predicted by the energy model, as determined with an $F$ test between the variances of cells in the first halves of the plots (0--45$^\circ$) and the second halves (45--90$^\circ$) (simulation: $P< 0.001$, physiology: $P<0.005$).}
\label{fig:cycsd}
\end{center}
\end{figure}
Figure \ref{fig:cycsd} depicts this ratio $W/\lambda$ in relation to the preferred orientation of the cells. There it can be seen that $W/\lambda$ ranges from approximately 0.15 to 0.9 in the simulation, while the same quantity ranges only from about 0.05 to 0.5 in real tuning curves. Nonetheless, I find in both data sets a dependence of the number of cycles per SD on orientation, as expected under the energy model (see the end of section \ref{sec:princeact}). Moreover, this relationship is significant on an $F$ test comparing the variances of $W/\lambda$ of cells with orientations within $45^\circ$ of horizontal with those of cells with orientations within $45^\circ$ of vertical, in simulation ($F$ ratio: 4.66, df=(33,61), $P< 0.001$) and physiology ($F$ ratio and df not given, $P< 0.005$).
\\
\\
In summary, investigating the disparity tuning of simulated cells with random dot stereograms yields a different type of response variance than found in neurons in V1. As a result, a measure which takes these variances into account, such as the DDI, also differs between the two conditions. Moreover, the distribution of the DDI for optimally stable cells necessarily differs from physiology, because, in contrast to physiology, all optimally stable cells show significant disparity tuning. Although the general types of this tuning are similar to findings in biology, the analysis of simulated disparity tuning reveals further discrepancies. For example, I frequently observe tuning curves with several peaks, indicating that even cells with high preferred spatial frequency have wide RFs, which is uncommon in real neurons. In correspondence with the findings of previous sections, phase is the dominant disparity encoding principle in my data, but this dominance exceeds by far the one reported in \citet{prince2002}. Furthermore, the tendency towards large phase disparities in PA is carried over from the RF level, but is still not found in data from real cells. I conclude that optimally stable cells exhibit response properties to RDS which are roughly similar to those of binocular neurons, although clear deviations between the two appear on closer inspection.
%\subsection{Further results}
%evaluation methods (RDS)
%histogram: Mono by Uncorrelated (Read and Cumming)
% values of dominant monocular response are consistently greater than values of uncorrelated response, but not much (MUratio nearly normally distributed around 0.85)
% why uncorrelated response has to be smaller than sum of monocular responses:
% \[
% C = \sqrt{\sum_{subunits}(w_l I_l + w_r I_r)^2}
% \]
% % two subunits $w_lI_l=l$, $w_rI_r=r$:
% \[\begin{array}{@{}rcl}
% C &=& \displaystyle \sqrt{(l_1+r_1)^2 + (l_2+r_2)^2}\\
% &=& \displaystyle \sqrt{l_1^2+l_2^2 + r_1^2+r_2^2 + 2l_1r_1 + 2l_2r_2}
% \end{array}\]
% % $2l_1r_1 + 2l_2r_2$ describes correlation between left and right convolutions, is responsible for disparity, therefore $2l_1r_1 + 2l_2r_2$ = disp
% \[
% C = \sqrt{l_1^2+l_2^2 + r_1^2+r_2^2 + disp}
% \]
% % for monocular responses here left (i.e. $r=0$):
% \[
% C_l = \sqrt{l_1^2 + l_2^2}
% \]
% % for monocular responses here right (i.e. $l=0$):
% \[
% C_r = \sqrt{r_1^2 + r_2^2}
% \]
% % for uncorrelated response $disp=0$
% \[
% C_u = \sqrt{l_1^2+l_2^2 + r_1^2+r_2^2}
% \]
% \[
% \sqrt{a+b} \leq \sqrt{a}+\sqrt{b}
% \]
% $\Rightarrow$
% \[
% C_u \leq C_l+C_r
% \]
% -----> without sqrt model would predict that C_u = C_l+C_r
%DDI -> variable discharge of cortical neurons (Shadlen)
%diagram: DDI vs. orientation, DDI vs. ODI, DDI vs. frequency
%Gabor fitting (Numerical Recipes)
%diagram: d0 vs. phi (Prince et al.) -> tuning curve shapes
%diagram: d0 vs. phi in pixel
%diagram: parameter comparison
%diagram: frequency vs. disparity
%diagram: orientation vs. no. cycles in 1SD (Prince a)
%uncorrAct always larger than monocular activity
%diagram: disparity peak frequency vs. spatial frequency (Read and Cumming)
%Vgl.: position vs. phase disparity (+fit parameters vs. RFs)
%diagram: position vs. phase disparity (or vice versa; anzai 1999, but simple cells)
\newpage
\section{Discussion}
\label{sec:discussion}
In this work I have compared properties of binocular complex cells to those of simulated cells that exhibit optimally stable activity in response to binocular natural stimuli, in order to examine whether a stability criterion can explain binocular properties of striate cells.
From infinitely many possibilities the optimisation for stability has selected Gabor wavelets as subunit RF shapes. Consistent with physiological data, these subunit RFs have all their properties in common apart from phase, which stands in quadrature relationship between the two subunits of the cell model. Thus, given a slightly modified version of the binocular energy model as skeleton, most optimally stable representations of natural stimuli implement the Gabor energy model. Consequently, stable cells suffer from the same shortcomings as the Gabor energy model, which are discussed in the following subsection (\ref{sec:diffbem}). On the other hand, the energy model has been used successfully to model certain properties of complex cells. Correspondingly, I find several similarities between simulated and striate cells, such as their sensitivity to stimulus orientation and spatial frequency. Furthermore, I report that stable cells exhibit elongated regions along the frontoparallel axis in binocular interaction profiles, demonstrating that these cells respond to stimuli with a certain disparity independent of their position in the RF, which is a crucial feature of binocular complex cells \citep{anzai1999b}. It has also been found in physiology that two subunits, as used here, are sufficient to represent the responses of most complex cells \citep{anzai1999b}. Additionally, the Gabor energy model successfully predicts that tuning curves to random-dot stereograms with different disparities can be described by a 1D Gabor function; this is the case for optimally stable as well as for striate cells. \citet{prince2002} report that tuning curve shapes form a continuum between particular prototypical shapes, which is comparable to the tuning curves of simulated cells, although not all prototypes are as frequent as in real cells.
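The position invariance mentioned above follows directly from the quadrature relationship of the subunits. The following minimal Python sketch illustrates this in a deliberately simplified setting: a monocular, 1D quadrature pair of Gabor subunits (parameters hypothetical; the thesis model is binocular, 2D, and additionally takes a square root of the summed squares) responds to a grating at its preferred frequency with the same energy at every stimulus phase, i.e.\ independent of the stimulus position in the RF:

```python
import math

# quadrature pair of 1D Gabor subunit RFs (hypothetical parameters:
# sigma = 5 px, f = 0.2 cycles/px, sampled on a 41 px window)
SIGMA, FREQ = 5.0, 0.2
XS = range(-20, 21)
g_even = [math.exp(-x**2 / (2*SIGMA**2)) * math.cos(2*math.pi*FREQ*x) for x in XS]
g_odd  = [math.exp(-x**2 / (2*SIGMA**2)) * math.sin(2*math.pi*FREQ*x) for x in XS]

def energy(stimulus):
    """Gabor energy: sum of the squared responses of the two subunits."""
    e = sum(w * s for w, s in zip(g_even, stimulus))
    o = sum(w * s for w, s in zip(g_odd, stimulus))
    return e*e + o*o

# a grating at the preferred frequency evokes (nearly) the same energy
# at every spatial phase, i.e. at every position within the RF
responses = [energy([math.cos(2*math.pi*FREQ*x + phi) for x in XS])
             for phi in (0.0, 0.7, 1.4, 2.1, 2.8)]
assert max(responses) / min(responses) < 1.01
```

The invariance is only exact at the preferred frequency and up to discretisation effects, hence the tolerance in the final assertion.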
Quantitative comparisons of binocular properties also suggest many similarities between optimally stable and real cells. Accordingly, all my analyses yield evidence that the distributions of preferred disparity of stable and complex cells have core features in common. In particular, there is a similar dominance of phase encoding in simulation as reported in physiology, which is expressed in a greater range of disparities encoded by a phase mechanism than by a position difference between subunit RFs. Furthermore, in simple cells, which are thought to constitute the functional subunits of complex cells, phase and position disparities have been found to be uncorrelated \citep{anzai1999}, as observed in subunit RFs from the simulation. In general, subunit RF properties and those of simple cells match surprisingly well. For example, in both simulation and physiology it has been noticed that RFs with horizontal orientation encode a smaller range of disparities than those with vertical orientation \citep{anzai1999}. A further example, which also extends to the complex cell response level, is a size-disparity correlation, where the range of disparities represented depends on the periodicity of the tuning curve. This correlation is due to the dominance of phase encoding, because disparities encoded by this scheme are limited by the spatial frequency of the RFs. Correspondingly, a size-disparity correlation is evident in simulated and striate cells \citep{anzai1999,anzai1999b,prince2002}.
% similarities:
% Gabor wavelets
% apart from quadrature phase subunit parameters equal
% sensitivity to orientation, spatial frequency
% insensitive to stimulus position in RF as long as disparity is right
% two major subunits
% Gabor disparity tuning curves,
% all tuning types present
% disparity distributions centred around 0
% phase greater range of disparities than position => mainly phase encoding
% phase and position uncorrelated
% phase dependent on orientation (also difference)
% size-disparity correlation
% dependence of periodicity of tuning curve on orientation
Although the similarities between properties of optimally stable representations of natural stimuli and those of complex cells are striking, I also observe several clear discrepancies. Many of them are attributable to the employed cell model, but others may be due to the stability criterion or the properties of natural stimuli, or may simply reflect inaccuracies in the applied methods. The former are discussed below and the latter in the final subsection (\ref{sec:unresolved}).
\subsection{Differences attributable to the binocular energy model}
\label{sec:diffbem}
As mentioned above, most optimally stable cells implement a Gabor energy model. Therefore every discrepancy between this model and real neurons also holds between stable cells and real neurons. A large part of the literature concerning binocular striate cells evaluates properties of the energy model, and here I discuss some of the shortcomings brought forward so far.
The energy model consists of two subunits which stand in quadrature phase relationship, but \citet{anzai1999b} found complex cells requiring more than two functional subunits to explain their binocular responses. \citeauthor{anzai1999b} propose that the extra subunits may be needed to extend the RF of a complex cell in one dimension. But there are more reasons why more than two subunits may be needed. A substantial one has been published by \citet{cumming2002}, who reports that, independent of RF orientation, a V1 neuron modulates its firing rate over a wider range of horizontal than vertical disparity. In contrast, the energy model predicts that this depends on orientation, which can be observed in simulated cells. Hence, several subunits encoding different horizontal disparities may need to be combined in order to yield this preference for horizontal disparity.
Another discrepancy concerns the response to RDS that are uncorrelated between the two eyes. According to the energy model this has to be larger than the response to a random-dot stimulus presented solely to the dominant eye, because monocular responses are combined linearly in the energy model. Correspondingly, the ratio of ``dominant monocular response'' divided by ``uncorrelated response'' is consistently smaller than one in the simulation data, but \citet{read2003} found many striate cells where this ratio was significantly greater than one, and \citet{prince2002a} also report that the uncorrelated response is close to the mean of the monocular responses. Therefore, \citet{read2002,read2003} put forward a modification of the basic energy model in which threshold nonlinearities are included before monocular activities are combined. Such a model allows for consistent inhibitory input from one eye, so that a response to uncorrelated stimuli can be smaller than a response to a stimulus presented only to the dominant eye. The proposed modification also addresses another limitation of the basic energy model, which predicts that the response to anticorrelated RDS\footnote{black is white, white is black in the different eyes} is the exactly inverted version of the response to correlated RDS \citep{read2002}. On the contrary, it has been found that, even though the response of V1 neurons is indeed inverted, its modulation is weaker.
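The first prediction can be checked numerically. In the square-root energy model used here, the uncorrelated response equals $\sqrt{C_l^2+C_r^2}$ once the disparity-dependent cross terms average out, and therefore always lies between the dominant monocular response and the sum of the two monocular responses. A minimal Python sketch (the subunit convolution values are hypothetical random numbers standing in for responses to random-dot stimuli):

```python
import math
import random

def complex_response(l, r):
    """Square-root binocular energy model: each subunit sums its left and
    right convolution values before squaring; the cell takes the square
    root of the sum of squared subunit responses."""
    return math.sqrt(sum((li + ri) ** 2 for li, ri in zip(l, r)))

random.seed(1)
for _ in range(1000):
    # hypothetical subunit convolution values (two subunits per eye)
    l = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    r = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    C_l = complex_response(l, [0.0, 0.0])   # left eye only
    C_r = complex_response([0.0, 0.0], r)   # right eye only
    C_u = math.sqrt(C_l**2 + C_r**2)        # uncorrelated: cross terms = 0
    # the dominant monocular response never exceeds the uncorrelated one
    assert max(C_l, C_r) <= C_u <= C_l + C_r
```

Without the square root, the model would instead predict that the (squared) uncorrelated response equals the sum of the (squared) monocular responses exactly.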
Finally, RFs of neurons are known to extend not only in space, but also in time \citep{deangelis2003}. The framework presented here has no time dimension, and for that reason time-dependent properties of neurons, such as tuning to the direction of a moving stimulus, cannot be modelled.
% more than two subunits
% additional violation of energy model: \citet{Cum02} see also uka_currentbio_02 horizontal disparity.pdf
% response to uncorrelated RDS bigger than dominant monocular
% => no completely inhibitory input from one eye (read2), maybe binocular response normalisation (prince1)
% weaker response to contrast inverted stimuli (thresholded simple cells before binocular combination, prince1, read1, read2)
% no space-time RFs
% no end- and side-inhibition
% dependence DDI orientation (all problems of DDI transferred to unresolved)
\subsection[Other discrepancies]{Other discrepancies and optimally stable representations of natural stimuli}
\label{sec:unresolved}
Not all differences between optimally stable and striate cells can be ascribed to the employed cell model. Although the cell model supports sensitivity to disparity, the optimisation process was free to build RFs which do not result in disparity tuning of a cell. Nevertheless, I find that all simulated cells significantly modulate their activity to stimuli with different disparities, while there is a substantial subset of neurons in V1 for which this is not the case. However, it should be noted that all cells in the simulation receive equally strong input from both eyes, whereas in striate cells the strengths of the monocular inputs may be very different. Optimal stability has also been found to describe the responses of monocular complex cells well \citep{koerding2004}. Thus, whether a cell is disparity tuned or not is likely to depend on the balance of its monocular inputs.
The disparity tuning of simulated cells has been tested with RDS. The resulting activities of stable cells differ in many characteristics from the activities of striate neurons. In particular, the variances of the responses to different RDS with equal disparity are much larger in the simulation and increase quadratically with the mean, while the ratio of variance to mean has been observed to be constant in physiology. It has been proposed that it is beneficial for neurons in a network to maintain such a constant ratio of variance and mean \citep{shadlen1998} and that they may use certain mechanisms to achieve this, which are apparently missing in the model. Because the variances in simulation and physiology are so different, the DDI, which takes them into account, also exhibits some divergences between the two, which are not further discussed here.
There is a marked absence of RFs with oblique orientations in the simulation data. This clearly reflects properties of natural stimuli, in which oblique orientations are also not as frequent as horizontal or vertical ones \citep{einhaeuser2004}. The phenomenon is called the oblique effect and has also been observed in physiological and psychophysical experiments \citep{einhaeuser2004}, but, as demonstrated in figure \ref{fig:ddicomp}E,F, the effect is extremely strong in stable cells whereas it is not evident in a large data set of monkey striate cells from \citet{prince2002a}. Why the effect is completely absent in this data set while it can be seen in others is not clear to me.
Another discrepancy related to RF orientation arises from the finding of \citet{anzai1999b} that the components of binocular interaction profiles of horizontally oriented complex cells encode a large range of disparity\footnote{orthogonal to orientation, thus vertical}. Although I observe a similar trend in the simulation, it still exhibits a preference to encode a smaller range of vertical disparity, as is expected when monocular inputs are laterally displaced. Interestingly, the deficiency in vertical disparity is clearer if subunit RFs are analysed directly. Therefore, it could be that the method applied in this case reduces the dependence of the encoded disparity range on RF orientation in simulation and physiology. Furthermore, a proper evaluation of \citeauthor{anzai1999b}'s finding is difficult, because the data set is rather sparse in the considered regions.
Above I stated as a similarity that there is a size-disparity correlation between RF spatial frequency and the range of encoded disparities. On the other hand, I do not find a ``size-size'' correlation between RF spatial frequency and the width of the RFs; formulated differently, RF sizes do not change with RF spatial frequency and rather extend over the whole available space, which equals the size of their inputs ($20\times 20$ px). As a result, the number of cycles within disparity tuning curves of stable cells is increased in comparison with striate cells \citep{prince2002a}. From a stability perspective it makes sense to expand RFs over the whole patch, since then the response to a small stimulus can remain equal when the stimulus moves to another part of the input or RF, respectively, whereas a position shift to an empty part of the RF will extinguish the activity of a cell in any case. Thus, whether complex cells are adapted to their natural stimuli to be maximally stable also depends on whether cells with high spatial frequency RFs only receive input from a limited part of the visual field, or, more generally, whether RFs extend over the same part of the visual field as their combined input. Unfortunately, this is extremely hard to investigate.
The most critical observation that I make throughout my analyses of preferred disparity in stable cells is a considerably greater number of large phase disparities (near $\pm\pi$) than found in physiology \citep{anzai1999,anzai1999b,prince2002}, expressed in more dissimilar RF shapes between the two eyes and more TI tuning curves. \citet{goldbach2004} also found this tendency in a similar simulation, which implemented a sparseness instead of a stability criterion. Consequently, the effect might be due to the properties of natural visual stimuli, because it is apparently independent of the applied objective. In turn, it needs to be investigated whether opposite contrast in the monocular stimuli is as frequent as large phase disparity in optimised cells. However, such a finding would not help to explain the scarcity of phase disparities near $\pm\pi$ in striate neurons, which would even be counterintuitive if an adaptation of neurons to their natural stimuli is assumed. The issue remains unresolved, but it also contributes to a further deviation between simulation and physiology: I notice that phase encodes a larger range of disparities than position in stable cells, which has been reported for striate cells too, except that in V1 the ratio of phase to position disparity differs from the simulation data. This difference is highly dependent on the employed method, but it tends to be larger in optimally stable cells, which could be explained by the higher number of phases near $\pm\pi$. Yet this relationship still needs to be confirmed quantitatively.
% all simulated cells disparity selective, but not all real cells (improper combination of subunits, prince1)
% => DDI, variances
% lack of oblique orientations (the oblique effect, Einhäuser)
% dependence of disparity magnitude on orientation (anzai)
% phase with overrepresented tails (more TI)
% small pos vs. large phase in px: influence of phase bigger since phases greater than in physiology
% RFs über ganzen patch even with high frequency
% => more cycles per SD
\\
\\
In conclusion, I find striking similarities between stable and complex cells, indicating that binocular complex cells, too, can be described as forming optimally stable representations of their natural visual input. However, the cell model should be adapted to incorporate recent advances in the modelling of striate neurons, and further investigations are needed to explain the open discrepancy concerning phase disparity.
\newpage
\section*{Acknowledgements}
I am grateful to Peter König and Selim Onat, under whose supervision I learnt an awful lot about data analysis in the neurosciences. I thank Markus Goldbach for being a great fellow and Daniel Weiller for fruitful discussions. I also thank numerous friends, such as Robert Freund, Stephan Weller and Egon Stemle, for providing the counterbalance needed to do this work. Last but not least, I am grateful to my grandparents Maria and Rolf Voigt for their loving support of my studies.
% \textcolor{magenta}{}
%\input{d0VSphi}
%\includegraphics[natwidth=740,natheight=843,width=300pt,clip=true]{disparity.jpg}
\newpage
\bibliographystyle{abbrvnat}
\bibliography{brefs}
\newpage
\appendix
\section{Source code}
The principal Matlab functions and scripts that I have written during the work on this thesis are listed below. I have organised them into the categories Gabor fitting, RF analysis, interaction profiles, RDS and displaying.
\subsection*{Gabor fitting}
\subsubsection*{fitGabor1f.m}
\verbatiminput{source/fitGabor1f.m}
\subsubsection*{gabor1.m}
\verbatiminput{source/gabor1.m}
\subsubsection*{fitGabor2.m}
\verbatiminput{source/fitGabor2.m}
\subsubsection*{gabor2.m}
\verbatiminput{source/gabor2.m}
\subsection*{RF analysis}
\subsubsection*{grating.m}
\verbatiminput{source/grating.m}
\subsubsection*{findOri.m}
\verbatiminput{source/findOri.m}
\subsubsection*{fitGaborToRF.m}
\verbatiminput{source/fitGaborToRF.m}
\subsubsection*{RFAnalysis.m}
\verbatiminput{source/RFAnalysis.m}
\subsection*{Interaction profiles}
\subsubsection*{iProfile.m}
\verbatiminput{source/iProfile.m}
\subsubsection*{SVDAnalysis.m}
\verbatiminput{source/SVDAnalysis.m}
\subsubsection*{RFAnalysisSVD.m}
\verbatiminput{source/RFAnalysisSVD.m}
\subsection*{RDS}
\subsubsection*{genRDS.m}
\verbatiminput{source/genRDS.m}
\subsubsection*{RDStoActSingle.m}
\verbatiminput{source/RDStoActSingle.m}
\subsubsection*{RDStoAct.m}
\verbatiminput{source/RDStoAct.m}
\subsubsection*{calcActs.m}
\verbatiminput{source/calcActs.m}
\subsubsection*{fitRDS.m}
\verbatiminput{source/fitRDS.m}
\subsection*{Displaying}
\subsubsection*{showRFAnalysis.m}
\verbatiminput{source/showRFAnalysis.m}
\subsubsection*{showPrince.m}
\verbatiminput{source/showPrince.m}
\subsubsection*{showAllRDS.m}
\verbatiminput{source/showAllRDS.m}
\end{document}