Supersymmetric Artificial Neural Network

Thought Curvature, or the "Supersymmetric Artificial Neural Network" hypothesis (accepted to the 2019 String Theory and Cosmology Conference, GRC), is a Lie-superalgebra-bound algorithmic learning model, motivated by emerging evidence pertaining to supersymmetry in the biological brain.

It was introduced by Jordan Micah Bennett on May 10, 2016.

"Thought Curvature", or the "Supersymmetric Artificial Neural Network" (2016), is reasonably observable as a new branch or field of Deep Learning in Artificial Intelligence, called Supersymmetric Deep Learning by Bennett. Supersymmetric Artificial Intelligence (though not gradient-descent-like machine learning) can be traced back to work by Czachor et al., namely a single-section, four-paragraph thought experiment, the segment "Supersymmetry and dimensional reduction", describing a so-named "Supersymmetric Latent Semantic Analysis" (2004); i.e. supersymmetry-based singular value decomposition, absent neural networks or gradient descent. Most of that paper otherwise focuses on comparisons between non-supersymmetric LSA/singular value decomposition, traditional Deep Neural Networks, and Quantum Information Theory. Biological science/neuroscience saw application of supersymmetry as far back as 2007, by Perez et al. (See reference 3 from Bennett's paper.)

Method
Notation 1 - Manifold Learning: $$ \phi \big(x; \theta \big)^{\top}w $$

Notation 2 - Supermanifold Learning: $$ \phi \big(x; \theta, \bar\theta \big)^{\top}w $$

Instead of some $$\theta$$ neural network parameterization, as is typical in mean field theory or manifold learning models, the Supersymmetric Artificial Neural Network is parameterized by the supersymmetric directions $$\theta, \bar\theta$$.
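The anticommuting directions $$\theta, \bar\theta$$ have no counterpart in standard deep learning toolkits. Purely as an illustration, not as the paper's method, a minimal sketch of a pair of Grassmann-like generators can be built from a Jordan-Wigner-style matrix representation, so that $\theta^2 = \bar\theta^2 = 0$ and $\theta\bar\theta = -\bar\theta\theta$ hold exactly:

```python
import numpy as np

# Pauli-style building blocks for a two-generator Grassmann (exterior) algebra.
sigma_minus = np.array([[0., 1.], [0., 0.]])   # lowering operator
sigma_z = np.array([[1., 0.], [0., -1.]])
identity = np.eye(2)

# Jordan-Wigner-style representation: two generators that anticommute
# and square to zero, mimicking the supersymmetric directions theta, theta_bar.
theta = np.kron(sigma_minus, identity)
theta_bar = np.kron(sigma_z, sigma_minus)

# Defining properties of Grassmann numbers:
assert np.allclose(theta @ theta, 0)                          # theta^2 = 0
assert np.allclose(theta_bar @ theta_bar, 0)                  # theta_bar^2 = 0
assert np.allclose(theta @ theta_bar + theta_bar @ theta, 0)  # {theta, theta_bar} = 0
```

Any parameterization $\phi(x;\theta,\bar\theta)$ built on such generators would carry these nilpotent, anticommuting components alongside ordinary real parameters.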

An informal proof of the representation power gained by deeper abstractions of the “Supersymmetric Artificial Neural Network”
Machine learning non-trivially concerns the application of families of functions that guarantee more and more variations in weight space. This means that machine learning researchers study what functions are best to transform the weights of the artificial neural network, such that the weights learn to represent good values for which correct hypotheses or guesses can be produced by the artificial neural network.

The 'Supersymmetric Artificial Neural Network' is yet another way to represent richer values in the weights of the model; because supersymmetric values can allow for more information to be captured about the input space. For example, supersymmetric systems can capture potential-partner signals, which are beyond the feature space of magnitude and phase signals learnt in typical real valued neural nets and deep complex neural networks respectively. As such, a brief historical progression of geometric solution spaces for varying neural network architectures follows:

1. An optimal weight space produced by shallow or low-dimension integer-valued or real-valued artificial neural nets, may have good weights that lie, for example, in one simple ($\mathbb{Z}^n$- or $\mathbb{R}^n$-ordered) cone per class/target group.

2. An optimal weight space produced by deep and high-dimension-absorbing real valued artificial neural nets, may have good weights that lie in disentangleable $(\mathbb{R}^n * \mathbb{R}^n)$-ordered manifolds per class/target group convolved by the operator $*$, instead of the simpler regions per class/target group seen in item (1). (This may guarantee more variation in the weight space than (1), leading to better hypotheses or guesses)

3. An optimal weight space produced by shallow but high-dimension-absorbing complex valued artificial neural nets, may have good weights that lie in multiple $(\mathbb{C}^n)$-ordered sectors per class/target group, instead of the real regions per class/target group seen amongst the prior items. (This may guarantee more variation of the weight space than the previous items, by learning additional features, in the “phase space”. This also leads to better hypotheses/guesses)

4. An optimal weight space produced by deep and high-dimension-absorbing complex valued artificial neural nets, may have good weights that lie in chi-distribution-bound, $(\mathbb{C}^n*\mathbb{C}^n)$-ordered Rayleigh space per class/target group convolved by the operator $*$, instead of the simpler sectors/regions per class/target group seen amongst the previous items. (This may guarantee more variation of the weight space than the prior items, by learning phase space representations, and by extension, strengthen these representations via convolutional residual blocks. This also leads to better hypotheses/guesses)

5. The 'Supersymmetric Artificial Neural Network', operable on high-dimensional data, may reasonably generate good weights that '''lie in disentangleable $(C^{\infty}\big(\mathbb{R}^{m|n}\big)$-ordered) supermanifolds per class/target group, instead of the solution geometries seen in the prior items above. Supersymmetric values can encode rich partner-potential'''-delimited features beyond the phase space of (4), in accordance with cognitive biological space, where (4) lacks the partner-potential formulation describable in a supersymmetric embedding.
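Items (3) and (4) above refer to complex-valued networks learning in a magnitude-and-phase feature space. As a minimal, illustrative sketch (the layer and its sizes are hypothetical, not an architecture from this paper), a complex-valued dense layer exposes exactly those two channels:

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_dense(x, W, b):
    """One complex-valued dense layer: z = W x + b, with complex parameters."""
    return W @ x + b

# Hypothetical small layer: 4 complex inputs -> 3 complex units.
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
W = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
b = rng.standard_normal(3) + 1j * rng.standard_normal(3)

z = complex_dense(x, W, b)

# The two feature channels discussed above: magnitude and phase.
magnitude = np.abs(z)    # always >= 0
phase = np.angle(z)      # in (-pi, pi]

assert magnitude.shape == (3,) and phase.shape == (3,)
assert np.all(magnitude >= 0)
assert np.all((phase > -np.pi) & (phase <= np.pi))
```

A real-valued net sees only something like the magnitude channel; the complex-valued net of items (3)-(4) additionally learns in the phase channel, which is the "extra variation" the list is tracking before the supersymmetric step in item (5).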

Naive Architecture for the “Supersymmetric Artificial Neural Network"
Following is another view of “solution geometry” history, which may promote a clear way to view the reasoning behind the subsequent naive architecture sequence:

1. There has been a clear progression of “solution geometries”, ranging from those of the ancient Perceptron to unitary RNNs, complex-valued neural nets, or Grassmann manifold artificial neural networks. These models may be denoted by $\phi(x;\theta)^{\top}w$, parameterized by $\theta$, expressible as geometrical groups ranging from orthogonal to special-unitary-group based: $SO(n)$ to $SU(n)$, and so on; they became better at representing input data, i.e. representing richer weights, and thus the learning models generated better hypotheses or guesses.

2. By “solution geometry” I mean simply the class of regions where an algorithm's weights may lie, when generating those weights to do some task.

3. As such, if one follows cognitive science, one would know that biological brains may be measured in terms of supersymmetric operations. (“Supersymmetry at brain scale” )

4. These supersymmetric biological brain representations can be represented by supercharge-compatible special unitary notation $SU(m|n)$, or $\phi(x;\theta, \bar\theta)^{\top}w$, parameterized by $\theta, \bar\theta$, which are supersymmetric directions, unlike $\theta$ seen in item (1). Notably, supersymmetric values can encode or represent more information than the prior classes seen in (1), in terms of “partner potential” signals for example.

5. So, state of the art machine learning work forming $U(n)$- or $SU(n)$-based solution geometries, although non-supersymmetric, is already in the family of supersymmetric solution geometries that may be observed as occurring in the biological brain, or the $SU(m|n)$ supergroup representation.
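For concreteness, elements of the superalgebra behind $SU(m|n)$ can be pictured as block supermatrices $\begin{pmatrix} A &amp; B \\ C &amp; D \end{pmatrix}$, with even (bosonic) blocks $A$ ($m\times m$) and $D$ ($n\times n$) and odd (fermionic) blocks $B, C$; the trace is replaced by the supertrace $\mathrm{str}(M) = \mathrm{tr}(A) - \mathrm{tr}(D)$, which vanishes on $su(m|n)$. A minimal sketch (function names hypothetical):

```python
import numpy as np

def supermatrix(A, B, C, D):
    """Assemble a block supermatrix [[A, B], [C, D]]: A, D are the even
    (bosonic) blocks, B, C the odd (fermionic) blocks."""
    return np.block([[A, B], [C, D]])

def supertrace(M, m):
    """str(M) = tr(A) - tr(D), the invariant trace on gl(m|n)."""
    return np.trace(M[:m, :m]) - np.trace(M[m:, m:])

m, n = 2, 3
A = np.diag([1.0, 2.0])          # even block, m x m
D = np.diag([1.0, 1.0, 1.0])     # even block, n x n
B = np.ones((m, n))              # odd block
C = np.ones((n, m))              # odd block

M = supermatrix(A, B, C, D)
assert M.shape == (m + n, m + n)
# tr(A) = 3 and tr(D) = 3, so the supertrace vanishes, as required for su(m|n):
assert np.isclose(supertrace(M, m), 0.0)
```

The even blocks play the role of the ordinary $U(m)\times U(n)$ directions already used in unitary networks; the odd blocks are where the "partner potential" degrees of freedom discussed above would live.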

The “Edward Witten/String theory powered artificial neural network”, is simply an artificial neural network that learns supersymmetric weights.

Looking at the above progression of ‘solution geometries’: going from $$SO(n)$$ representation to $$SU(n)$$ representation has guaranteed richer and richer representations in the weight space of the artificial neural network, and hence better and better hypotheses could be generated. It is then somewhat natural to look to $$SU(m|n)$$ representation, i.e. the “Edward Witten/String theory powered artificial neural network” (“Supersymmetric Artificial Neural Network”).

To construct an “Edward Witten/String theory powered artificial neural network”, it may be feasible to compose a system which includes a Grassmann manifold artificial neural network, then generate ‘charts’ until scenarios occur where the “Edward Witten/String theory powered artificial neural network” is achieved, in the following way:

See points 1 to 5 in this reference

It seems feasible that a $$C^{\infty}$$-bound atlas-based learning model, where said $$C^{\infty}$$ is in the family of supermanifolds from supersymmetry, may be obtained from a system which includes charts $$(\phi_I, U_I)$$ of Grassmann manifold networks $$GR_{k,n}$$ and Stiefel manifolds $$GF_{k,n}$$, where there exists some invertible submatrix $$A \in \phi_I (U_I \cap U_J)$$ for $$U_I = \pi(V_I)$$, where $$\pi$$ is a submersion mapping on some Stiefel manifold $$GF_{k,n}$$, thereafter enabling some differentiable Grassmann manifold $$GR_{k}({\mathbb{R}^n})$$, with $$V_I = \big\{u \in \mathbb{R}^{n \times k} : \det(u_I) \neq 0\big\}$$.
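The chart construction above admits a small concrete sketch (illustrative only; the index set and sizes are hypothetical): a $k$-plane in $\mathbb{R}^n$ represented by $u$ with invertible submatrix $u_I$ gets the standard affine chart coordinates $u\,u_I^{-1}$, and two representatives of the same subspace map to the same chart value, which is what makes the chart well defined on the Grassmannian:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 2
I = [0, 1]  # hypothetical index set selecting the k x k submatrix u_I

def chart(u, I):
    """Affine chart on the Grassmannian GR_k(R^n): defined on
    V_I = {u : det(u_I) != 0}, mapping span(u) to u @ inv(u_I)."""
    u_I = u[I, :]
    assert abs(np.linalg.det(u_I)) > 1e-12, "u is outside the chart domain V_I"
    return u @ np.linalg.inv(u_I)

u = rng.standard_normal((n, k))
g = rng.standard_normal((k, k)) + 3 * np.eye(k)  # invertible change of basis

# u and u @ g span the same k-plane, so the chart value is identical:
assert np.allclose(chart(u, I), chart(u @ g, I))
```

Varying the index set $I$ over the $k$-element subsets of $\{1,\dots,n\}$ yields the atlas of overlapping charts that the passage's $(\phi_I, U_I)$ notation refers to.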

Artificial Neural Network/Symmetry group landscape visualization
1. $$ O \big(n\big) $$ structure – Orthogonal; not connected (it has two components), therefore not amenable to gradient descent in machine learning. (Paper: See note 2 at end of page 2, in reference.)

2. $$ SO \big(n\big) $$ structure – Special Orthogonal; is connected, gradient descent compatible, while preserving orthogonality, concerning normal space-time. (Paper: See paper in item 1).

3. $$ SU \big(n\big) $$ structure – Special Unitary; is connected, gradient descent compatible; complex generalization of $$ SO \big(n\big) $$, but only a subspace of the larger unitary space, concerning normal space-time. (The Unitary Evolution Recurrent Neural Network relates to the complex unit circle, $$ U \big(1\big) $$, in physics. (See page 2 in, and page 7 in.))

4. $$ U \big(n\big) $$ structure – Unitary; is connected, gradient descent compatible; Larger unitary landscape than $$ SU \big(n\big) $$, concerning normal space-time.

5. $$ SU \big(m|n\big) $$ structure – Supersymmetric; is connected, thereafter reasonably gradient descent compatible, and even larger than the $$ U \big(n\big) $$ landscape, permitting sparticle invariance, being a Poincaré group extension (See page 7 in ) containing both normal space-time and anti-commuting components, as seen in the Supersymmetric Artificial Neural Network which this page proposes.
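Items (2)-(4) above hinge on gradient descent that stays on the group manifold. One standard way to realize this, used for example in the unitary-RNN literature, is a Cayley-transform update along a skew-symmetric direction; a minimal real-valued ($O(n)$/$SO(n)$) sketch follows, with hypothetical names and sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

def cayley_step(W, G, lr=0.1):
    """One orthogonality-preserving update: project the Euclidean gradient G
    onto a skew-symmetric direction A, then move along the Cayley transform
    W <- (I + lr/2 * A)^(-1) (I - lr/2 * A) W, which stays orthogonal."""
    A = G @ W.T - W @ G.T                 # skew-symmetric: A.T == -A
    I = np.eye(n)
    return np.linalg.solve(I + (lr / 2) * A, (I - (lr / 2) * A) @ W)

# Start from a random orthogonal matrix (QR factor) and a random "gradient".
W, _ = np.linalg.qr(rng.standard_normal((n, n)))
G = rng.standard_normal((n, n))

W_new = cayley_step(W, G)
# Orthogonality is preserved (up to floating point) by the update:
assert np.allclose(W_new.T @ W_new, np.eye(n), atol=1e-10)
```

The complex analogue (skew-Hermitian direction, unitary $W$) gives the $U(n)$/$SU(n)$ rows of the table; what a corresponding retraction on an $SU(m|n)$ supergroup would look like is exactly the open construction this page proposes.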

Ending Remarks
Pertinently, the “Edward Witten/String theory powered supersymmetric artificial neural network” is one wherein supersymmetric weights are sought. Many machine learning algorithms are not empirically shown to be exactly biologically plausible; e.g. Deep Neural Network algorithms have not been observed to occur in the brain, but regardless, such algorithms work in practice in machine learning.

Likewise, regardless of supersymmetry's elusiveness at the LHC, as seen above it may be quite feasible to borrow formal methods from strategies in physics, even if such strategies are yet to show that the related physical phenomena exist; thus it may be pertinent/feasible to try to construct a model that learns supersymmetric weights, as I proposed throughout this paper, following the progression of solution geometries from $$SO(n)$$ to $$SU(n)$$ and onwards to $$SU(m|n)$$.