Quantum ML models for periodic, spherical, and 3-D rotational data


In a previous post about quantum variational inference on polynomial models, I showed how you can infer expectation values and uncertainties of a discrete random variable $\mathscr{E}$ by regressing log-probability data ($V(\cdot)$) and finding the eigenvalues of the equation

$$\left[ -\frac{1}{2} \hat{\nabla}^2 + \hat{V} \right] \ket{\Psi} \,=\, \mathscr{E} \ket{\Psi}$$

using variational inference. But very few data can be reliably fit to a polynomial. Like, for example, periodic data such as daily power usage, bioavailability medication taken daily, or a chemical bond rotation. Or like latitudinal and longitudinal data, e.g. positions on a globe or the orientation of a linear molecule. Or, even more complicated, like latitudinal and longitudinal AND periodic data – like 3-D pose detection data or the orientation of a non-linear molecule.

This post is about how to apply the same kind of basis function regression and matrix diagonalization to these more complicated cases. I’ll look at quantum models involving


  1. Regression to a Fourier series expansion for N-D periodic data ($\Theta=\theta$),
  2. Regression to a spherical harmonics expansion for spherical data ($\Theta=\theta,\ \Phi=\phi$), and
  3. Regression to a Wigner D-matrix series expansion for 3-D rotational data ($\Phi=\phi,\ \Theta=\theta,\ X=\chi$).

Let’s do some Quantum ML.

Regression & Quantum Variational Inference on a Fourier Expansion

If we have a function $V$ of the periodic input data $\Theta=\theta$, we can find a representation of $V(\theta)$ by taking a linear combination of sines and cosines (a Fourier series)

$$V(\theta) \,=\, \sum_{n=0}^{N} c_n \psi_n(\theta)$$

$$\psi_n(\theta) \,=\, \begin{cases} \frac{1}{\sqrt{2\pi}} & n = 0 \\ \frac{1}{\sqrt{\pi}} \cos\left(\frac{n+1}{2}\theta\right) & n \text{ odd} \\ \frac{1}{\sqrt{\pi}} \sin\left(\frac{n}{2}\theta\right) & n \text{ even} \end{cases}$$


import numpy as np
import torch

class FourierSeries(torch.nn.Module):
    """
    Trainable 2*pi-periodic Fourier series
    """
    def __init__(self, order=8):
        super().__init__()
        # 1 constant term plus a (cos, sin) pair for each of `order` frequencies
        self.c = torch.nn.Parameter(torch.zeros(1 + 2*order), requires_grad=True)
    
    def forward(self, x):
        y = 0.
        for n,c in enumerate(self.c):
            y += c*self._fourier_fn(x, n)
        return y

    def _fourier_fn(self, x, n):
        if n == 0:
            # zero / constant
            return np.power(2*np.pi, -0.5)*torch.ones_like(x)
        elif n % 2: 
            # odd n / cos
            return np.power(np.pi, -0.5)*torch.cos(0.5*(n+1)*x)
        else: 
            # even n / sin
            return np.power(np.pi, -0.5)*torch.sin(0.5*n*x)
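
Here's a minimal sketch of fitting this module to noisy samples with a standard torch training loop; the signal and hyperparameters below are illustrative choices for the example, not values from the original pipeline:

# A minimal fitting sketch -- the signal and hyperparameters are illustrative
x = torch.linspace(-np.pi, np.pi, 128)
y = torch.sin(x) + 0.1*torch.randn_like(x)  # noisy 2*pi-periodic samples

model = FourierSeries(order=8)
opt = torch.optim.Adam(model.parameters(), lr=0.1)
for step in range(500):
    opt.zero_grad()
    loss = torch.mean((model(x) - y)**2)    # least-squares objective
    loss.backward()
    opt.step()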

Conveniently, we can also use these as basis functions for a quantum model, since we know

$$\begin{aligned} \braket{\theta | n} &\,=\, \psi_n(\theta) \\ \braket{m | n} &\,=\, \delta_{mn} \end{aligned}$$

$$\underbrace{-\frac{1}{2}\nabla_\theta^2}_{H_\text{Ring}} \psi_n(\theta) \,=\, \frac{1}{2} \begin{cases} \left(\frac{n+1}{2}\right)^2 & n \text{ odd} \\ \left(\frac{n}{2}\right)^2 & n \text{ even} \end{cases} \psi_n(\theta) \qquad \forall\, n \geq 0$$

$$\mathbf{H}_\text{Ring} \,=\, \frac{1}{2}\begin{bmatrix} 0 & & & & & & & \\ & 1 & & & & & & \\ & & 1 & & & & & \\ & & & 4 & & & & \\ & & & & 4 & & & \\ & & & & & \ddots & & \\ & & & & & & N^2 & \\ & & & & & & & N^2 \end{bmatrix}$$

Confused? See previous post. In code, we can define a Fourier regression model similar to the polynomial regression model like so


class Ring(QuantumBasis):
    def __init__(self, order):
        super().__init__(order)

    def potential(self, x):
        return 0 * x

    def basis_fn(self, x, n):
        if n == 0: 
            # zero / constant
            return np.power(2*np.pi, -0.5)*torch.ones_like(x)
        elif n % 2: 
            # odd n / cos
            return np.power(np.pi, -0.5)*torch.cos(0.5*(n+1)*x)
        else: 
            # even n / sin
            return np.power(np.pi, -0.5)*torch.sin(0.5*n*x)

    @staticmethod
    def H_basis(order):
        # Diagonal free-rotor Hamiltonian: (1/2)*diag(0, 1, 1, 4, 4, ..., order^2, order^2)
        return 0.5*torch.diag(torch.cat([torch.zeros(1), 
                                         torch.arange(1, order+1).repeat_interleave(2)]).square())
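
As a quick sanity check (assuming the QuantumBasis parent class from the previous post is in scope), the diagonal of `H_basis` reproduces the $\mathbf{H}_\text{Ring}$ matrix above:

>>> torch.diag(Ring.H_basis(3))
# -> 0.5 * [0, 1, 1, 4, 4, 9, 9] = [0.0, 0.5, 0.5, 2.0, 2.0, 4.5, 4.5]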

So if we were to model the response variable $V(x)$ from samples of an input $x$ using a similar quantum mechanics example as the previous post, we could solve the following equations in a perfectly analogous way

$$-\frac{1}{2} \nabla^2 \Psi(x) + V(x) \Psi(x) \,=\, \mathscr{E} \Psi(x)$$

$$P(\mathscr{E}=E_n) \,=\, \frac{e^{-\beta E_n}}{\sum_m e^{-\beta E_m}}$$

$$P(X=x \,|\, \mathscr{E}=E_n) \,=\, \left| \Psi_n(x) \right|^2$$

$$P(X) \,=\, \sum_n P(X \,|\, \mathscr{E}=E_n)\, P(\mathscr{E}=E_n)$$

I’ll lay out this process for some sample data for a periodic feature $X$ and its response $V$:

# Define the "true" potential energy
V_periodic = lambda x: 0.1*np.sin(x)*(1-60*np.cos(3*x)-40*np.cos(2*x))

# Generate sample data with random jitter
inc = np.pi/32 # choose a grid size
sample_x = np.arange(-np.pi, np.pi, inc)
sample_y = V_periodic(sample_x) + np.random.normal(0,1,len(sample_x))

# Use regression to fit the data in the Fourier basis
coeffs, fit_fn = periodic_fit(sample_x, sample_y, max_iter=50, tol=0.2)

# Then plot the potential, data, and fit
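
The `periodic_fit` helper isn't shown in this post; here's a minimal sketch of how it could look, mirroring the SVD/least-squares structure of the `spherical_fit` function below, with `psi_fourier(n, x)` implementing the basis functions $\psi_n$ defined above (the same `psi_fourier` is used by `H_mat_p` further down):

from sklearn.metrics import r2_score

def psi_fourier(n, x):
    # The 2*pi-periodic basis functions psi_n defined above
    if n == 0:
        return np.power(2*np.pi, -0.5)*np.ones_like(x)
    elif n % 2:
        return np.power(np.pi, -0.5)*np.cos(0.5*(n+1)*x)
    else:
        return np.power(np.pi, -0.5)*np.sin(0.5*n*x)

def periodic_fit(xpts, ypts, max_iter=50, tol=0.2):
    """Least-squares fit in the Fourier basis, growing the bandwidth until R^2 converges."""
    size = 1
    prevR2 = 0
    while size < max_iter:
        # Design matrix of basis functions evaluated at the sample points
        h = np.array([[psi_fourier(n, x) for n in range(size)] for x in xpts])
        coeffs = np.linalg.pinv(h) @ np.array(ypts)
        predicted = h @ coeffs
        # Tolerance defined as deviation from R^2 = 1
        if np.abs(prevR2 - 1) < tol:
            fit_fn = lambda x: np.sum([c*psi_fourier(n, x) for n,c in enumerate(coeffs)], axis=0)
            return coeffs, fit_fn
        size += 1
        prevR2 = r2_score(ypts, predicted)
    raise RuntimeError("periodic_fit did not converge within max_iter")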

Note that if we were to continually increase the bandwidth $N$, we would certainly be able to recreate all data points with 100% accuracy. However, this would lead to overfitting. To avoid this, we choose a conservative value of $R^2$ at which to abort the regression, yielding an appropriate bandwidth. If we wanted to be cleverer, we could add a regularization term or do heavier analysis on the choice of $N$.
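
For instance, swapping the pseudo-inverse step in the sketch above for a ridge (L2) penalized solve is a one-line change ($\lambda$ here is an illustrative hyperparameter, not a tuned value):

# Ridge-regularized least squares: coeffs = (h^T h + lambda*I)^(-1) h^T y
lam = 1e-3  # illustrative regularization strength
coeffs = np.linalg.solve(h.T @ h + lam*np.eye(h.shape[1]), h.T @ np.array(ypts))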

To solve the PDE using the data fitted by Fourier regression and find the eigenvalues $\mathscr{E}$ (up to a convergent normalization constant), we construct the matrix

$$\mathbf{H}_{mn} \,=\, \mathbf{H}_{0,mn} + \int_0^{2\pi} \mathrm{d}X\, \Phi_m(X)\, V(X)\, \Phi_n(X)$$

where $\mathbf{H}_0$ is the known solution to the PDE in the Fourier basis with $V=0$, which is a matrix with eigenvalues on the diagonal

$$\mathbf{H}_{0,mn} \,=\, \begin{cases} \frac{1}{2} \left(\frac{m+1}{2}\right)^2 \delta_{mn} & m \text{ odd} \\ \frac{1}{2} \left(\frac{m}{2}\right)^2 \delta_{mn} & m \text{ even} \end{cases}$$

In code, we then have

import scipy.integrate

def H_mat_p(size, fit_fn):
    """
    Build the Hamiltonian matrix in the Fourier basis.
    Inputs:
    - size: basis set size
    - fit_fn: a fn of (φ) for the fit [Hartree]
    """
    H = np.zeros(shape=(size,size))
    for m in range(size):
        for n in range(m, size):
            if m == n:
                # Set diagonal equal to free particle on a ring (POR) soln.
                # Ej = j^2/2
                j = 0.5*(m+1) if m % 2 else 0.5*m
                H[m,n] += 0.5 * j * j
            # Do the integral to solve potential contributions given the fit
            intgrl, err = scipy.integrate.quad(lambda x: psi_fourier(m,x)*fit_fn(x)*psi_fourier(n,x),
                                               -np.pi, np.pi)
            H[m,n] += intgrl # add contribution
            if m != n:
                H[n,m] += intgrl # mirror onto the symmetric off-diagonal element
    return H
>>> H = H_mat_p(20, fit_fn)
>>> eig,vec = np.linalg.eigh(H)

The distribution of $X$ and statistics of $\mathscr{E}$ can be visualized just as before.
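
In code, the mixture density $P(X)$ follows directly from the eigendecomposition, per the equations above (a minimal sketch; $\beta$ is an illustrative choice, not a value from the post):

# Reconstruct P(X) from the eigendecomposition -- beta is illustrative
beta = 1.0
xgrid = np.linspace(-np.pi, np.pi, 500)
P_E = np.exp(-beta*(eig - eig.min()))   # Boltzmann weights (shifted for numerical stability)
P_E /= P_E.sum()                        # P(E = E_n)
# Each eigenvector column holds the Fourier-basis coefficients of Psi_n
Psi = np.array([[np.dot(vec[:,n], [psi_fourier(m, x) for m in range(len(eig))])
                 for x in xgrid] for n in range(len(eig))])
P_X = np.sum(P_E[:,None] * np.abs(Psi)**2, axis=0)   # P(X) on the grid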

If this were a quantum chemistry problem, the desired statistics (mean, variance, entropy) could be straightforwardly computed by the same protocol as in the polynomial regression example.
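
The summary statistics follow from the same Boltzmann weights (again a sketch, reusing `P_E` from above):

# Boltzmann-weighted statistics of the eigenvalue distribution
mean_E = np.sum(P_E * eig)                  # mean <E>
var_E  = np.sum(P_E * eig**2) - mean_E**2   # variance
S      = -np.sum(P_E * np.log(P_E))         # Shannon entropy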

Regression on a Spherical Harmonics Expansion

The process of modeling spherical data is ridiculously similar, with a Fourier-esque expansion possible using the spherical harmonic functions $Y^\ell_m(\theta,\phi)$. Unlike a 1-D Fourier series, however, spherical data is parameterized in two dimensions, a polar angle $\theta$ (latitude) and an azimuthal angle $\phi$ (longitude), and will have two integers indexing its coefficients,

$$f(\theta,\phi) \,\approx\, \sum_{\ell=0}^{L}\sum_{m=-\ell}^{\ell} a^{\ell}_m Y^{\ell}_m(\theta,\phi)$$

$$Y^{\ell}_m (\theta, \phi) \,=\, \begin{cases} (-1)^m \sqrt{2} \sqrt{\frac{2\ell + 1}{4\pi} \frac{(\ell-|m|)!}{(\ell+|m|)!}}\, \mathscr{P}^{|m|}_\ell (\cos \theta) \sin(|m|\phi) & m < 0 \\ \sqrt{\frac{2\ell + 1}{4\pi}}\, \mathscr{P}^{m}_\ell (\cos\theta) & m = 0 \\ (-1)^m \sqrt{2} \sqrt{\frac{2\ell + 1}{4\pi} \frac{(\ell-m)!}{(\ell+m)!}}\, \mathscr{P}^{m}_\ell (\cos \theta) \cos(m\phi) & m > 0 \end{cases}$$

where the functions $\mathscr{P}^m_\ell(x)$ are the associated Legendre polynomials. These can be visualized in the following table, courtesy of Wikipedia:

Despite the daunting look of these equations, it’s actually quite easy to build a linear regression model using already-existing functions from modules like scipy.special, like so

import numpy as np
from scipy.special import sph_harm as Y
from sklearn.metrics import r2_score

def spherical_fit(xpts, ypts, max_iter=20, tol=0.05):
    """
    Do a fit in the spherical harmonics (Laplace) basis
    Inputs:
    - xpts: the sampling positions (2D)
    - ypts: the sampling values at the sampling positions
    - max_iter: the max number of fitting iterations
    - tol: the tolerance of the convergence (deviation from R^2 = 1)
    Outputs:
    - coeffs: the fitting coefficients in ascending order from 0
    - fit_fn: the fitting function associated with these coefficients
    """

    # Define real spherical harmonics from scipy's complex ones
    # (note scipy's convention: sph_harm(m, l, azimuthal, polar))
    f = lambda l,m,th,ph: np.sqrt(2)*np.real(Y(m,l,ph,th)) if m > 0 else \
                          np.sqrt(2)*np.imag(Y(m,l,ph,th)) if m < 0 else \
                          np.real(Y(m,l,ph,th))
    lmax = 1
    prevR2 = 0
    while lmax < max_iter:
        inds = [] # Create 1-D array of all (l, m) indices within our bandwidth
        for l in range(lmax):
            for m in range(-l, l+1):
                inds.append([l, m])
        # Design matrix of basis functions evaluated at the sample points
        h = np.array([[f(ind[0],ind[1],xpt[0],xpt[1]) for ind in inds] for xpt in xpts])
        u,s,v = np.linalg.svd(h, full_matrices=False) # Singular value decomposition
        coeffs = np.matmul(np.linalg.pinv(np.matmul(u, np.matmul(np.diag(s), v))), np.array(ypts))
        predicted = np.dot(coeffs, h.T)
        # Tolerance defined as deviation from R^2 = 1.
        if np.abs(prevR2 - 1) < tol:
            fit_fn = np.vectorize(lambda th,ph: np.sum([coeff*f(ind[0],ind[1],th,ph)
                                                        for ind,coeff in zip(inds, coeffs)]))
            return coeffs, fit_fn
        lmax += 1
        prevR2 = r2_score(ypts, predicted)
    raise RuntimeError("spherical_fit did not converge within max_iter")

To demonstrate how this works, I’ll use actual data from one of my projects describing the orientation of a molecule, which is parameterized by three features: latitude $X_1=\theta$, longitude $X_2=\phi$, and polar rotation $X_3=\chi$. Here is a useful image for visualizing this concept:

To draw a parallel to more common iterations of this kind of dataset, we can think of the data as representing global coordinates (latitudes and longitudes on Earth) with yearly, monthly, or daily periodicity encoded by $\chi$. With that in mind, let’s make a model of the latitudes and longitudes at a single instant in time, say $\chi = 0.0$:

>>> df = pd.read_csv('/path/to/wigner_data.csv')    # Read in the "Wigner" data
>>> sph_df = df[df['X3'] == 0.00] # Select only χ=0 data
>>> sample_x = np.array([np.array(sph_df.X1), np.array(sph_df.X2)]).T
>>> sample_y = np.array(sph_df.Y)
>>> coeffs, fit_fn = spherical_fit(sample_x, sample_y) # fit the data

With spherical data, the result of the regression can be difficult to visualize, since it requires either mapping a projection of a sphere or simultaneously viewing podal and antipodal perspectives of a globe. Luckily, both of these options are available when working with global data through a module called cartopy, and we can easily adapt them for our model.

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature

def generate_plottable_sph_data(fn, shape=(73, 135)):
    """Return ``lons``, ``lats`` and ``data`` of the regression result."""
    nlats, nlons = shape
    lats = np.linspace(-np.pi/2, np.pi/2, nlats)
    lons = np.linspace(0, 2 * np.pi, nlons)
    lons, lats = np.meshgrid(lons, lats)
    # lat  = polar angle θ, shifted from the latitude convention
    # long = azimuthal angle φ from 0 -> 2π
    # f = f(θ,φ)
    data = fn(lats+np.pi/2, lons) # latitude vs polar angle conventions
    lats = np.rad2deg(lats)
    lons = np.rad2deg(lons)
    return lons, lats, data

def spherical_plot(lons, lats, data):
    # Some parameters:
    PHI, THETA = 45,45 # Viewing perspective angles
    lvls = np.arange(-2,22) # This is data-specific, not general
    cmap = 'viridis' # Nice colors

    fig = plt.figure(figsize=(10,10))
    # Podal and antipodal orthographic perspectives, side by side
    ax1 = fig.add_subplot(121, projection=ccrs.Orthographic(PHI, THETA))
    ax2 = fig.add_subplot(122, projection=ccrs.Orthographic(PHI-180, -THETA))
    ax1.gridlines(color='black', linestyle='--')
    ax2.gridlines(color='black', linestyle='--')
    ax1.set_global()
    ax2.set_global()

    # Plot contours and contour lines, with labels, on BOTH plots
    filled_c1 = ax1.contourf(lons, lats, data,
                             transform=ccrs.RotatedPole(pole_latitude=-90, pole_longitude=50),
                             cmap=cmap, levels=lvls)
    line_c1 = ax1.contour(lons, lats, data,
                          transform=ccrs.RotatedPole(pole_latitude=-90, pole_longitude=50),
                          colors=['black'], levels=filled_c1.levels)
    ax1.clabel(
        line_c1, # Typically best results when labelling line contours.
        colors=['black'],
        manual=False, # Automatic placement vs manual placement.
        inline=True, # Cut the line where the label will be placed.
        fmt=' {:.0f} '.format, # Labels as integers, with some extra space.
    )
    filled_c2 = ax2.contourf(lons, lats, data,
                             transform=ccrs.RotatedPole(pole_latitude=-90, pole_longitude=50),
                             cmap=cmap, levels=lvls)
    line_c2 = ax2.contour(lons, lats, data,
                          transform=ccrs.RotatedPole(pole_latitude=-90, pole_longitude=50),
                          colors=['black'], levels=filled_c2.levels)
    ax2.clabel(
        line_c2, # Typically best results when labelling line contours.
        colors=['black'],
        manual=False, # Automatic placement vs manual placement.
        inline=True, # Cut the line where the label will be placed.
        fmt=' {:.0f} '.format, # Labels as integers, with some extra space.
    )

>>> lons,lats,data = generate_plottable_sph_data(fit_fn)
>>> spherical_plot(lons,lats,data)

Running this code, we have a nice representation of the spherical harmonics regression with $R^2 > 95\%$. Given any latitude and longitude, we can predict the value of our response variable given only the data at $\chi = 0$.
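
For example, a single prediction is just a call to the fitted function (the angles here are arbitrary illustrative values):

>>> fit_fn(np.pi/3, np.pi/2)  # predicted response at θ = 60°, φ = 90°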

Regression & Quantum Variational Inference on a Wigner D-Matrix Expansion

If our data were shown not to depend on the third feature $\chi$, our work would be done with the spherical harmonics regression. And if this were a quantum example, we would then solve the PDE as usual using the known solutions in the spherical basis. However, we can clearly see from our data that our response variable depends on $\chi$, which should therefore be included in a more rigorous regression model that accounts for this third feature.

Fortunately, this is possible by expansion using another group of functions called the Wigner D-matrices (the “D” stands for Darstellung, German for “representation”). I’ve seen this expansion called the SO(3) Fourier transform (SOFT) in the literature, but to give a little more credit to Eugene Wigner I’ll call it the Wigner transform/regression. As the intelligent reader may have anticipated, this expansion will include coefficients indexed by three integers since we now have three parameters to fit. The expansion will look like this

$$f(\phi,\theta,\chi) \,\approx\, \sum_{\ell=0}^{L} \sum_{m=-\ell}^{\ell} \sum_{k=-\ell}^{\ell} c^\ell_{mk} D^\ell_{mk} (\phi,\theta,\chi)$$

$$D^{\ell}_{mk}(\phi,\theta,\chi) \,=\, \sqrt{\frac{(\ell+k)!\,(\ell-k)!}{(\ell+m)!\,(\ell-m)!}}\, \sin^{k-m}\frac{\theta}{2}\, \cos^{k+m}\frac{\theta}{2}\, \hat{\mathscr{P}}^{k-m,\,k+m}_{\ell-k}(\cos\theta)\, e^{-im\phi} e^{-ik\chi}$$

where $\hat{\mathscr{P}}^{a,b}_n$ are the generalized Legendre (Jacobi) polynomials. Clearly, this is more complex (no pun intended) than the spherical harmonics expansion, and it’s also not readily available in mainstream modules like scipy.special. Nevertheless, I was able to find two nice modules from Mike Boyle named spherical and quaternionic that contain all of the functions needed for computing a Wigner regression:

import spherical
import quaternionic

def wigner_fit(xpts, ypts, max_iter=20, tol=0.05):
    lmax = 1
    prevR2 = 0
    while lmax < max_iter:
        # Build a matrix of Wigner D-matrix elements:
        # for every sampling point, we need to evaluate the D-matrices at that point.
        # The number of D-matrix elements scales cubically with the bandwidth.
        wigner = spherical.Wigner(lmax) # create D-matrix calculator at maximum bandwidth
        Dbasis = []
        for pt in xpts:
            # pt = [θ, φ, χ]
            R = quaternionic.array.from_euler_angles(pt[1], pt[0], pt[2]) # generate quaternion
            D = wigner.D(R) # evaluate D-matrix at angles defined by quaternion
            Dbasis.append(D)
        Dbasis = np.array(Dbasis)
        U,S,V = np.linalg.svd(Dbasis, full_matrices=False)
        ahat = np.matmul(np.linalg.pinv(np.matmul(U, np.matmul(np.diag(S), V))), np.array(ypts))

        # Check the fit
        # Reminder: we are working with complex data types,
        # meaning the coefficients and matrices will be complex.
        # However, the predicted vals will be real!
        # First calculate what the regression will predict at each datapoint
        predicted = np.zeros(len(ypts), dtype=np.complex128)
        for i,pt in enumerate(xpts):
            R = quaternionic.array.from_euler_angles(pt[1], pt[0], pt[2])
            predicted[i] += np.dot(ahat, wigner.D(R))
        # Then compute the R^2 value, aborting once within tolerance
        if np.abs(prevR2 - 1) < tol:
            fit_fn = np.vectorize(lambda phi,theta,chi: np.real(np.dot(ahat, spherical.Wigner(lmax).D(
                quaternionic.array.from_euler_angles(phi,theta,chi)))))
            return ahat, fit_fn
        lmax += 1
        prevR2 = r2_score(ypts, np.real(predicted)) # make sure we have real data to compare
    raise RuntimeError("wigner_fit did not converge within max_iter")

Then we can generate a plot with the Wigner regression, and compare it against the spherical harmonics regression to see how incorporation of a third parameter changes the prediction:

def generate_plottable_wig_data(fn, chi, shape=(73, 135)):
    """
    Return ``lons``, ``lats`` and ``data`` of regression result.
    Inputs:
    - fn: the function we created from wigner_fit
    - chi: float, the value of chi at which to analyze the function
    """
    nlats, nlons = shape
    lats = np.linspace(-np.pi/2, np.pi/2, nlats)
    lons = np.linspace(0, 2 * np.pi, nlons)
    lons, lats = np.meshgrid(lons, lats)
    # lat  = polar angle θ, shifted from the latitude convention
    # long = azimuthal angle φ from 0 -> 2π
    # f = f(φ,θ,χ)
    data = fn(lons, lats+np.pi/2, chi) # latitude vs polar angle conventions
    lats = np.rad2deg(lats)
    lons = np.rad2deg(lons)
    return lons, lats, data
>>> wsample_x = np.array([df.X1, df.X2, df.X3]).T
>>> wsample_y = np.array(df.Y)
>>> wcoeffs, wfit_fn = wigner_fit(wsample_x, wsample_y)
>>> lons,lats,data = generate_plottable_wig_data(wfit_fn, chi=0)
>>> spherical_plot(lons,lats,data)

This is a more accurate depiction of our data that accounts for variation in the feature $X_3=\chi$. As an added measure, we can create several of these regression plots at different values of $\chi$ and turn them into a gif

which is the header to this post. I think this kind of visualization makes the relationship between periodic variables and time cycles like days, months, and years a lot clearer. Just as with the spherical and periodic data, we can solve linear PDEs through eigendecomposition using the method of variations. The mathematics behind this using Wigner D-matrix basis functions is extremely dense, so I won’t go into it here. But a curious reader might take interest in Chapter IV and Appendix C of my dissertation, linked at the top of the page, where I go into how to do this in much more detail.

In my next post, I will find global periodic data and use Wigner regression to make predictions.