Vectorizing code to calculate (squared) Mahalanobis Distance

EDIT 2: this post seems to have been moved from Cross Validated to Stack Overflow because it is mostly about programming, but that means my fancy MathJax doesn't work anymore. Hopefully this is still readable.

Say I want to calculate the squared Mahalanobis distance between two vectors x and y with covariance matrix S. This is a fairly simple function defined by

M2(x, y; S) = (x - y)^T * S^-1 * (x - y)

With python's numpy package I can do this as

# x, y = numpy.ndarray of shape (n,)
# s_inv = numpy.ndarray of shape (n, n)
diff = x - y
d2 = diff.T.dot(s_inv).dot(diff)

or in R as

diff <- x - y
d2 <- t(diff) %*% s_inv %*% diff

In my case, though, I am given

  • m by n matrix X
  • n-dimensional vector mu
  • n by n covariance matrix S

and want to find the m-dimensional vector d such that

d_i = M2(x_i, mu; S)  ( i = 1 .. m )

where x_i is the ith row of X.

This is not difficult to accomplish using a simple loop in python:

d = numpy.zeros((m,))
for i in range(m):
    diff = x[i, :] - mu
    d[i] = diff.T.dot(s_inv).dot(diff)

Of course, because the outer loop runs in python rather than in native code inside the numpy library, this is not as fast as it could be. n is about 3-4, m is several hundred thousand, and I'm doing this somewhat often in an interactive program, so a speedup would be very useful.

Mathematically, the only way I've been able to formulate this using basic matrix operations is

d = diag( X' * S^-1 * X'^T )

where

 x'_i = x_i - mu

which is simple to write in vectorized form, but this is unfortunately outweighed by the inefficiency of computing a matrix with more than ten billion elements only to take its diagonal... I believe this operation should be easily expressible in Einstein notation, and thus could hopefully be evaluated quickly with numpy's einsum function, but I haven't even begun to figure out how that black magic works.
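
For concreteness, here is my untested guess at what a vectorized version might look like; the einsum subscript string in particular is an assumption on my part, since I haven't verified it:

import numpy

# x is (m, n), mu is (n,), s_inv is (n, n)
diff = x - mu  # mu broadcasts across the m rows of x

# diag-free version: compute only the row-wise quadratic forms,
# never materializing the full (m, m) product
d = numpy.sum(diff.dot(s_inv) * diff, axis=1)

# presumed einsum equivalent: d_i = sum_{j,k} diff_ij * s_inv_jk * diff_ik
d_alt = numpy.einsum('ij,jk,ik->i', diff, s_inv, diff)

If broadcasting works the way I think it does, both forms avoid the m by m intermediate entirely, but I have no idea whether either is correct or actually fast.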

So, I would like to know: is there either a nicer way to formulate this operation mathematically (in terms of simple matrix operations), or could someone suggest some nice vectorized (python or R) code that does this efficiently?

BONUS QUESTION, for the brave

I don't actually want to do this once; I want to do it k ~ 100 times. Given:

  • m by n matrix X
  • k by n matrix U
  • set of n by n covariance matrices, each denoted S_j (j = 1..k)

Find the m by k matrix D such that

D_i,j = M2(x_i, u_j; S_j)

Where i = 1..m, j = 1..k, x_i is the ith row of X and u_j is the jth row of U.

I.e., vectorize the following code:

# s_inv is a (k x n x n) array containing "stacked" inverses
# of the covariance matrices
d = numpy.zeros((m, k))
for j in range(k):
    for i in range(m):
        diff = x[i, :] - u[j, :]
        d[i, j] = diff.T.dot(s_inv[j, :, :]).dot(diff)
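
Again, my untested guess at a fully vectorized form, using broadcasting to build the stacked differences (the einsum subscripts are an assumption):

import numpy

# x is (m, n), u is (k, n), s_inv is (k, n, n)
# diff[i, j, :] = x[i, :] - u[j, :], giving shape (m, k, n)
diff = x[:, None, :] - u[None, :, :]

# presumed einsum: D_ij = sum_{a,b} diff_ija * s_inv_jab * diff_ijb
d = numpy.einsum('ija,jab,ijb->ij', diff, s_inv, diff)

Though I suspect the (m, k, n) intermediate may itself become a memory concern at my sizes.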
