First Effort Step Binary Projection
Started with a binary random projection of features and samples.
I created a binary random projection whose output feeds a kNN Hamming function. That function computes Hamming distances to the nearest neighbors and predicts a label by computing the bin count of the matching y_train labels. Before the kNN Hamming function can run, the binary random projection has to be defined for X_train and x_test.
Second Effort Step Specify features shape for y_train (112,)
Next, I fixed the specifications for the binary random projection.
The goal was for the projection to respect the structure of X_train, which has shape (255, 112) (features=255, samples=112), so that the number of projected samples matches the number of labels in y_train, which has shape (112,).
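The shape bookkeeping for this step can be sketched as follows. This is only an illustration under assumptions: random numbers stand in for the real STFT data, and L = 100 is an arbitrary number of projected bits. The point is that the (features, samples) matrix has to be transposed so that each of the 112 rows lines up with one label in y_train.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_samples, L = 255, 112, 100   # L = number of projected bits (assumed)

X_stft = rng.random((n_features, n_samples))   # stand-in for matrix_train_stft
y_train = np.full(n_samples, 2)                # stand-in labels, shape (112,)

# One projection matrix: one row per output bit, one column per feature.
projection = rng.standard_normal((L, n_features))

# Put samples in rows, then project: (112, 255) @ (255, 100) -> (112, 100).
X_projected = X_stft.T @ projection.T

print(X_projected.shape)                          # (112, 100)
print(X_projected.shape[0] == y_train.shape[0])   # True
```

With this layout the first dimension of the projected matrix is the sample count, which is what a row-wise distance function expects.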
Third Effort on Binary Projection
I then updated binary_random_projection by reversing the original code: the arguments changed from (X, param) to (X_stft_matrix, project_matrix), and the matrix positions were swapped to dot(project_matrix.T, X_stft_matrix). This produced a different error (see "Error after binary_random_projection reversing" below).
def binary_random_projection(X_stft_matrix, project_matrix):
    Y = np.dot(project_matrix.T, X_stft_matrix)
    Y_binary = np.where(np.abs(Y) > 0, 1, 0)
    return Y_binary.astype(int)
Problem in kNN_hamming that resulted in kNN prediction of all zeros
After the binary random projection, the prediction variable returned by the kNN_hamming function is zero for every one of the y_train (112,) labels.
Main code for kNN prediction
This kNN code runs after the binary projection of X_train, x_test, and y_train (binary). The expected outcome is that the k loop (for k in K_list) and the m loop (for m in K_list) each produce the accuracy measured by np.mean(y_pred == y_test).
I unit tested this code before adding the binary projection functions, and the kNN loops below produced correct prediction accuracy. So something in create_random_projection_matrix() or binary_random_projection() has disrupted the accuracy computation.
L_list = [100, 150, 200, 300, 400, 600]  # Different choices of L
K_list = [60, 80, 120, 180, 240, 300]  # Different choices of K for kNN
accuracy_df = pd.DataFrame(columns=['Accuracy', 'K'])
results = []
for L in L_list:
    parmTrain = create_random_projection_matrix(L=255, M=112)
    X_train_binary = binary_random_projection(matrix_train_stft, parmTrain)
    parmTest = create_random_projection_matrix(L=255, M=28)
    X_test_binary = binary_random_projection(matrix_test_stft, parmTest)
    for k in K_list:
        pred = kNN_hamming(X_train_binary, X_test_binary, y_train, k)
        #y_distances, nearest_neighbors_idxs = kNN_hamming(X_train_binary, X_test_binary, y_train, k)
        accuracy = np.mean(y_pred == y_test)
        print(f"L={L}, K={k}, Accuracy={accuracy:.2f}")
    for m in K_list:
        y_pred = kNN_hamming(X_train_binary, X_test_binary, y_train, m)
        accuracy = np.mean(y_pred == y_test)
        results.append({'Accuracy': accuracy, 'K': m})
Trace results:
L=100, K=60, Accuracy=0.00
L=100, K=80, Accuracy=0.00
L=100, K=120, Accuracy=0.00
L=100, K=180, Accuracy=0.00
L=100, K=240, Accuracy=0.00
L=100, K=300, Accuracy=0.00
L=150, K=60, Accuracy=0.00
L=150, K=80, Accuracy=0.00
L=150, K=120, Accuracy=0.00
L=150, K=180, Accuracy=0.00
L=150, K=240, Accuracy=0.00
L=150, K=300, Accuracy=0.00
L=200, K=60, Accuracy=0.00
L=200, K=80, Accuracy=0.00
L=200, K=120, Accuracy=0.00
L=200, K=180, Accuracy=0.00
L=200, K=240, Accuracy=0.00
L=200, K=300, Accuracy=0.00
L=300, K=60, Accuracy=0.00
L=300, K=80, Accuracy=0.00
L=300, K=120, Accuracy=0.00
Binary Projection Code functions
These functions were added after the unit test above passed. When I unit tested again with them in place, the prediction accuracy was zero for all y_train (112,) labels. The full code, including the print statements used for tracing, is:
def create_random_projection_matrix(L, M):
    projection_matrix = np.random.randn(L, M)
    projection_matrix /= np.linalg.norm(projection_matrix, axis=1)[:, np.newaxis]
    return projection_matrix

def binary_random_projection(X, param):
    Y = np.dot(X, param.T)
    Y_binary = np.where(np.abs(Y) > 0, 1, 0)
    return Y_binary.astype(int)

def kNN_hamming(X_train_binary, X_test_binary, y_train, k):
    print('---kNN_hamming---')
    print('X_train_binary:', X_train_binary.shape)
    print('y_train:', y_train.shape)
    print('X_test_binary:', X_test_binary.shape)
    y_train_flat = y_train.flatten()
    print('y_train_flat', y_train_flat.shape)
    distances = cdist(X_test_binary, X_train_binary, metric='hamming')[:112,0]
    print('distances:', distances.shape)
    nearest_neighbors_idxs = np.argsort(distances, axis=0)[:k]
    #nearest_neighbors_idxs = np.argsort(distances, axis=1)[:,:k]
    y_pred = np.array([np.argmax(y_train_flat[idx]) for idx in nearest_neighbors_idxs])
    return y_pred
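For comparison, here is a minimal sketch of the bin-count voting described at the top of this post. It is not the code I ran: the function name knn_hamming_sketch and the tiny hand-made bit arrays are hypothetical. It assumes samples are in rows and labels are non-negative integers. It keeps the full (n_test, n_train) distance matrix, sorts along axis=1 as in the commented-out line, and takes the most frequent label among the k nearest neighbors with np.bincount.

```python
import numpy as np
from scipy.spatial.distance import cdist


def knn_hamming_sketch(X_train_bits, X_test_bits, y_train, k):
    """Hypothetical variant of kNN_hamming: majority vote over the k nearest rows."""
    # Full distance matrix, shape (n_test, n_train); nothing is sliced away.
    distances = cdist(X_test_bits, X_train_bits, metric='hamming')
    # Indices of the k closest training rows for every test row.
    nearest = np.argsort(distances, axis=1)[:, :k]
    # Labels of those neighbours, shape (n_test, k).
    neighbour_labels = y_train[nearest]
    # Most frequent label per test row (labels must be non-negative ints).
    return np.array([np.bincount(row).argmax() for row in neighbour_labels])


# Tiny hand-made example: two groups of 6-bit codes with labels 0 and 2.
X_train_bits = np.array([[0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 1],
                         [1, 1, 1, 1, 1, 1],
                         [1, 1, 1, 1, 1, 0]])
y_train = np.array([0, 0, 2, 2])
X_test_bits = np.array([[0, 0, 0, 0, 1, 0],
                        [1, 1, 1, 1, 0, 1]])

y_pred = knn_hamming_sketch(X_train_bits, X_test_bits, y_train, k=2)
print(y_pred)   # [0 2]
```

The result has one prediction per test row, so np.mean(y_pred == y_test) compares arrays of equal length.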
Error after binary_random_projection reversing
def binary_random_projection(X_stft_matrix, project_matrix):
    Y = np.dot(project_matrix.T, X_stft_matrix)
    Y_binary = np.where(np.abs(Y) > 0, 1, 0)
    return Y_binary.astype(int)

Resulted in ValueError:

---> 22 distances = cdist(X_test_binary, X_train_binary, metric='hamming')[:112,0]
     23 print('distances:', distances.shape)
     24 nearest_neighbors_idxs = np.argsort(distances, axis=0)[:k]

File ~\anaconda3\Lib\site-packages\scipy\spatial\distance.py:2986, in cdist(XA, XB, metric, out, **kwargs)
   2984     raise ValueError('XB must be a 2-dimensional array.')
   2985 if s[1] != sB[1]:
-> 2986     raise ValueError('XA and XB must have the same number of columns '
   2987                      '(i.e. feature dimension.)')
   2989 mA = s[0]
   2990 mB = sB[0]

ValueError: XA and XB must have the same number of columns (i.e. feature dimension.)
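The shape arithmetic behind this error can be reproduced with stand-in data. This sketch assumes matrix_test_stft has shape (255, 28), which I am inferring from M=28. The random inputs and the shared matrix with L = 100 columns are illustrative, not my actual setup. With separate projection matrices per split, the reversed dot product yields (112, 112) for train and (28, 28) for test, so cdist sees different column counts.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)

train_stft = rng.random((255, 112))   # stand-in for matrix_train_stft
test_stft = rng.random((255, 28))     # assumed shape of matrix_test_stft

# Reversed form: dot(project_matrix.T, X) with a separate matrix per split.
train_proj = np.dot(rng.standard_normal((255, 112)).T, train_stft)   # (112, 112)
test_proj = np.dot(rng.standard_normal((255, 28)).T, test_stft)      # (28, 28)
print(train_proj.shape, test_proj.shape)

mismatch_error = None
try:
    cdist(test_proj, train_proj, metric='hamming')
except ValueError as err:
    mismatch_error = err
    print('cdist failed:', err)

# One shared (255, L) matrix gives both splits the same L columns.
shared = rng.standard_normal((255, 100))                # L = 100 is arbitrary
train_bits = (train_stft.T @ shared > 0).astype(int)    # (112, 100)
test_bits = (test_stft.T @ shared > 0).astype(int)      # (28, 100)
D = cdist(test_bits, train_bits, metric='hamming')
print(D.shape)   # (28, 112)
```

Using the same projection matrix for train and test also keeps the two sets of bits comparable, since each bit then corresponds to the same random direction in both.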
Here is the STFT data for X_train (matrix_train_stft), which is the input to the binary projection that feeds the kNN_hamming function:
matrix_train_stft (255, 112)
[[0.20493472+0.j 0.28272246+0.j 0.64385114+0.j ... 0.35044909+0.j 0.31872643+0.j 0.26226732+0.j]
 [0.3087083 +0.j 0.15318839+0.j 0.41855107+0.j ... 0.38959425+0.j 0.32697115+0.j 0.45987799+0.j]
 [0.54496816+0.j 0.22221813+0.j 0.39610509+0.j ... 0.36975602+0.j 0.06623323+0.j 0.19927445+0.j]
 ...
 [0.14595922+0.j 0.36853684+0.j 0.5001006 +0.j ... 0.07091054+0.j 0.49812299+0.j 0.54165295+0.j]
 [0.49533511+0.j 0.49681426+0.j 0.52401434+0.j ... 0.13344165+0.j 1.38174332+0.j 0.63528698+0.j]
 [0.44414767+0.j 0.2128674 +0.j 0.56310035+0.j ... 0.22561237+0.j 0.64175042+0.j 0.50903301+0.j]]
The trace printed by kNN_hamming shows the array shapes at each call. For the run that produced the zero accuracies above, this is what was happening inside the kNN_hamming() function:
---kNN_hamming---
X_train_binary: (255, 255)
y_train: (112,)
X_test_binary: (255, 255)
y_train_flat (112,)
distances: (112,)
L=100, K=60, Accuracy=0.00
---kNN_hamming---
X_train_binary: (255, 255)
y_train: (112,)
X_test_binary: (255, 255)
y_train_flat (112,)
distances: (112,)
L=100, K=80, Accuracy=0.00
---kNN_hamming---
X_train_binary: (255, 255)
y_train: (112,)
X_test_binary: (255, 255)
y_train_flat (112,)
distances: (112,)
L=100, K=120, Accuracy=0.00
---kNN_hamming---
X_train_binary: (255, 255)
y_train: (112,)
X_test_binary: (255, 255)
y_train_flat (112,)
distances: (112,)
L=100, K=180, Accuracy=0.00
---kNN_hamming---
X_train_binary: (255, 255)
y_train: (112,)
X_test_binary: (255, 255)
y_train_flat (112,)
distances: (112,)
L=100, K=240, Accuracy=0.00
---kNN_hamming---
X_train_binary: (255, 255)
y_train: (112,)
X_test_binary: (255, 255)
y_train_flat (112,)
distances: (112,)
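Two details in this trace can be checked in isolation. The snippet below is a small sketch with stand-in arrays, not my real data. It shows that slicing a (255, 255) distance matrix with [:112, 0] keeps only a single column, and that np.argmax applied to one scalar label returns 0 whatever the label is. The second point may be why every prediction comes out as zero.

```python
import numpy as np

y_train = np.full(112, 2)     # same labels as the sample y_train
idx = 5                       # any single neighbour index

label = y_train[idx]          # a scalar label, here 2
position = np.argmax(label)   # argmax of a scalar is its only position
print(label, position)        # 2 0

# Slicing a (255, 255) distance matrix with [:112, 0] keeps one column only,
# i.e. the distances from 112 test rows to the first training row.
distances = np.zeros((255, 255))[:112, 0]
print(distances.shape)        # (112,)
```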
Data Samples
Sample X_train
array([[[-0.01074219, -0.0234375 , -0.01025391, ..., -0.11767578,  0.04931641,  0.01171875],
        [-0.00830078,  0.02441406, -0.015625  , ..., -0.10400391,  0.03320312, -0.02587891],
        [ 0.00634766,  0.0546875 ,  0.01806641, ..., -0.15332031, -0.03222656, -0.01660156]],
       [[ 0.03466797,  0.01660156, -0.10742188, ..., -0.11669922, -0.00048828, -0.05615234],
        [-0.02197266, -0.00390625, -0.02685547, ..., -0.10595703,  0.04931641, -0.0546875 ],
        [-0.00146484,  0.03271484, -0.05224609, ..., -0.11132812,  0.02783203, -0.04345703]],
       [[-0.02441406,  0.01611328, -0.14013672, ..., -0.06933594, -0.01806641, -0.17041016],
        [-0.04882812, -0.01464844, -0.07080078, ..., -0.07666016,  0.07421875, -0.10498047],
        [-0.03369141, -0.01855469, -0.11474609, ..., -0.12646484,  0.07519531, -0.07617188]]
Sample X_train Binary
X_train_binary: (255, 255)
[[1 1 1 ... 1 1 1]
 [1 1 1 ... 1 1 1]
 [1 1 1 ... 1 1 1]
 ...
 [1 1 1 ... 1 1 1]
 [1 1 1 ... 1 1 1]
 [1 1 1 ... 1 1 1]]
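The all-ones matrix looks consistent with the threshold np.abs(Y) > 0, since a magnitude is positive for any nonzero entry. As a sketch on stand-in real-valued data (the 4x6 random matrix is hypothetical), comparing that threshold with a sign threshold Y > 0 shows the difference. For complex STFT input, a sign threshold would first need a real quantity such as the real part or the magnitude spectrum. Which one to use is a design choice I have not settled.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((4, 6))   # stand-in for a real-valued projection

bits_abs = np.where(np.abs(Y) > 0, 1, 0)   # thresholding the magnitude
bits_sign = np.where(Y > 0, 1, 0)          # thresholding the sign

print(bits_abs.min())     # 1: every entry is set, so the code carries no information
print(bits_sign.mean())   # fraction of set bits; a mix of 0s and 1s
```

When every bit is 1, all Hamming distances are 0 and the neighbor ordering is arbitrary.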
Sample y_train
array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])