Channel: Active questions tagged python - Stack Overflow

How to get a list of all indices of repeated elements in a numpy array


I'm trying to get the indices of all repeated elements in a numpy array, but the solution I have at the moment is really inefficient for a large input array (more than 20,000 elements): it takes roughly 9 seconds. The idea is simple:

  1. records_array is a numpy array of timestamps (datetime) from which we want to extract the indices of the repeated timestamps

  2. time_array is a numpy array containing all the timestamps that are repeated in records_array

  3. records is a Django QuerySet (which can easily be converted to a list) containing some Record objects. We want to create a list of couples formed by all possible combinations of the tagId attributes of the Record objects that share a repeated timestamp found in records_array.
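To make the inputs concrete, here is a minimal sketch of how records_array and time_array could look. The sample timestamps and the `datetime64[s]` dtype are assumptions made for illustration only; the real array may hold a different datetime type.

```python
import numpy as np

# Hypothetical sample data: five timestamps, two of which occur more than once.
records_array = np.array(
    [
        "2020-01-01T00:00:00",
        "2020-01-01T00:00:05",
        "2020-01-01T00:00:00",
        "2020-01-01T00:00:09",
        "2020-01-01T00:00:05",
    ],
    dtype="datetime64[s]",
)

# Sorted unique timestamps and the number of times each one occurs.
unique_times, counts = np.unique(records_array, return_counts=True)

# Keep only the timestamps that appear more than once.
time_array = unique_times[counts > 1]
```

With this sample data, time_array holds the two repeated timestamps (the ones ending in :00 and :05).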

Here is the working (but inefficient) code I have for the moment:

    import itertools

    import numpy as np

    tag_couples = []
    for t in time_array:
        # Indices of every record in records_array whose timestamp equals t
        users_inter = np.nonzero(records_array == t)[0]
        # Temporary list of all tagIds recorded at time t
        l = [str(records[i].tagId) for i in users_inter]
        # Skip groups in which every tag equals the first one
        if l.count(l[0]) != len(l):
            # Drop duplicate tags with set(l), then append every pair of distinct tags
            tag_couples += itertools.combinations(set(l), 2)

I'm quite sure this can be optimized with NumPy, but I can't find a way to compare records_array with each element of time_array without using a for loop (the two can't simply be compared with ==, since both are arrays).
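One direction that might avoid the per-timestamp comparison is to sort the array once and split the sorted indices into runs of equal values. The sketch below is only an illustration under assumptions, not a confirmed solution: `repeated_index_groups` and `tag_couples_from` are hypothetical helper names, and `tag_ids` is assumed to be a plain list of the tagId values, in the same order as records_array.

```python
import itertools

import numpy as np


def repeated_index_groups(values):
    """Return one index array per value that occurs more than once."""
    values = np.asarray(values)
    # Stable sort keeps the indices inside each group in ascending order.
    order = np.argsort(values, kind="stable")
    sorted_values = values[order]
    # Start position and length of each run of equal values.
    _, starts, counts = np.unique(
        sorted_values, return_index=True, return_counts=True
    )
    # Split the sorted indices at every run boundary except the leading 0.
    groups = np.split(order, starts[1:])
    return [g for g, c in zip(groups, counts) if c > 1]


def tag_couples_from(records_array, tag_ids):
    """Build all pairs of distinct tags that share a repeated timestamp."""
    couples = []
    for idx in repeated_index_groups(records_array):
        # Sorting the distinct tags makes the pair order deterministic.
        distinct = sorted({str(tag_ids[i]) for i in idx})
        couples.extend(itertools.combinations(distinct, 2))
    return couples
```

This replaces one full-array comparison per repeated timestamp with a single sort. Unlike the loop above, the tags within each pair come out sorted, because a set alone gives no guaranteed order. Groups whose tags are all identical yield no pairs, which matches the `l.count(l[0]) != len(l)` check.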

