Channel: Active questions tagged python - Stack Overflow

Is there a good "elastic search" in python for filtering a list of items?

I have written a GUI tool for our warehouse which filters a list of currently about 5000 products (each an instance of a class). Now I am looking for something faster / better to use.

Here is an example of the data:

    apple = Product()
    apple.name = "Apple, Red"
    apple.ean = "12345"
    apple.art_nr = "A001"
    apple.price = 1.00   # the price is not used for the search

    cherry = Product()
    cherry.name = "Cherry, Red"
    cherry.ean = "66550"
    cherry.art_nr = "A345"
    cherry.price = 1.45

    banana = Product()
    banana.name = "Banana"
    banana.ean = "12359"
    banana.art_nr = "A002"
    banana.price = 0.50

    list_of_products = [apple, cherry, banana]

My function now walks through each item in the list and compares attributes to a search string.

    search_results = list()
    search_str = "345"

    # assuming "elastic_match_Product_attributes" returns a boolean
    # if the Product attributes match the search criteria
    for item in list_of_products:
        if elastic_match_Product_attributes(item, search_str):
            search_results.append(item)

In this example the result will show Cherry (match in art_nr) and Apple (match in ean).

When searching for "Red Apple" I also find one result ("Apple, Red"), because I split the search string into substrings (separated by spaces) and all substrings must match (is this behavior called an "elastic search"?).

Hence the search string "Red ," will match two items (Apple & Cherry), while "Red," will not match anything.

Searching for "50" will only return "Cherry" (ean match), but not "Banana" (price), because the price is not one of the searched attributes.
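The matching rule described above can be sketched roughly as follows. This is only one possible reading of the behavior, not the actual implementation from the GUI tool: the `Product` dataclass is a hypothetical stand-in, the helper `search` is introduced for illustration, and it is assumed that matching is case-insensitive and that each token may match in any of the three searched attributes.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical stand-in for the Product class used in the question.
    name: str = ""
    ean: str = ""
    art_nr: str = ""
    price: float = 0.0


# Attributes that take part in the search; price is deliberately left out.
SEARCHED_ATTRIBUTES = ("name", "ean", "art_nr")


def elastic_match_Product_attributes(item, search_str):
    """Return True if every space-separated token occurs in some searched attribute."""
    values = [str(getattr(item, attr)).lower() for attr in SEARCHED_ATTRIBUTES]
    tokens = search_str.lower().split()
    # Assumption: an empty search string matches every item (all() over no tokens).
    return all(any(token in value for value in values) for token in tokens)


def search(products, search_str):
    """Hypothetical helper: the loop from the question as a list comprehension."""
    return [p for p in products if elastic_match_Product_attributes(p, search_str)]


apple = Product("Apple, Red", "12345", "A001", 1.00)
cherry = Product("Cherry, Red", "66550", "A345", 1.45)
banana = Product("Banana", "12359", "A002", 0.50)
list_of_products = [apple, cherry, banana]

for query in ("345", "Red Apple", "Red ,", "Red,", "50"):
    print(repr(query), [p.name for p in search(list_of_products, query)])
```

With the three sample products, this reproduces the results listed above, including the difference between "Red ," (two tokens) and "Red," (one token that occurs nowhere).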

So far everything is working as expected. Still, I am wondering which search engines or libraries already exist that do this job better (and faster) than my rather simple loop, which compares every item in the list against the search string.
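One cheap variation on the plain loop, sketched here under assumptions rather than measured on the real data: build one lowercased search string per product once, so that each query only performs a few `in` checks per item instead of repeated attribute lookups. The names `build_index` and `search` are hypothetical, and `SimpleNamespace` merely stands in for the `Product` class.

```python
from types import SimpleNamespace

SEARCHED_ATTRIBUTES = ("name", "ean", "art_nr")


def build_index(products):
    """Pair each product with one pre-lowercased string of its searched attributes."""
    # "\n" separates the attributes so that a token cannot match across two of
    # them; tokens come from str.split() and therefore never contain whitespace.
    return [
        ("\n".join(str(getattr(p, a)) for a in SEARCHED_ATTRIBUTES).lower(), p)
        for p in products
    ]


def search(index, search_str):
    """Return all products whose search string contains every token."""
    tokens = search_str.lower().split()
    return [p for haystack, p in index if all(t in haystack for t in tokens)]


# SimpleNamespace stands in for the Product class from the question.
list_of_products = [
    SimpleNamespace(name="Apple, Red", ean="12345", art_nr="A001", price=1.00),
    SimpleNamespace(name="Cherry, Red", ean="66550", art_nr="A345", price=1.45),
    SimpleNamespace(name="Banana", ean="12359", art_nr="A002", price=0.50),
]

index = build_index(list_of_products)  # rebuild whenever the product data changes
print([p.name for p in search(index, "red ,")])
```

This is still a linear scan over all items. Whether it is noticeably faster for about 5000 products would have to be timed.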

I tried pandas DataFrames, but searching them with a loop was 10 times slower than searching the plain list (I know: one should not loop over a DataFrame, because it performs poorly).

Ideally, such a tool would also support features like weighting of the columns, fuzzy search, or indexed data for a speedup.

When I programmed this GUI a year ago, I found a few things, but nothing was really handy at the time. (Sorry, the "results" are lost somewhere in my endless bookmark list.)

Recently I stumbled upon Keyvi and was wondering whether it might be suited for the task and be faster (I'd probably need to restructure the data, though).
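To make "weighting of the columns" and "fuzzy search" concrete, here is a minimal sketch that uses only `difflib` from the standard library. It is an illustration, not a recommendation of a particular engine: the weights, the cutoff of 0.6, and the names `fuzzy_score` and `fuzzy_search` are all assumptions. It compares the query against whole words, so it tolerates typos but does not do substring matching.

```python
import re
from difflib import SequenceMatcher
from types import SimpleNamespace

# Hypothetical weights: a hit in the name counts more than one in a number field.
WEIGHTS = {"name": 1.0, "art_nr": 0.8, "ean": 0.6}


def fuzzy_score(item, query):
    """Best weighted similarity between the query and any word of a searched attribute."""
    query = query.lower()
    best = 0.0
    for attr, weight in WEIGHTS.items():
        for word in re.findall(r"\w+", str(getattr(item, attr)).lower()):
            ratio = SequenceMatcher(None, query, word).ratio()
            best = max(best, weight * ratio)
    return best


def fuzzy_search(products, query, cutoff=0.6):
    """Return the products scoring at least `cutoff`, best match first."""
    scored = [(fuzzy_score(p, query), p) for p in products]
    hits = [(score, p) for score, p in scored if score >= cutoff]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in hits]


# SimpleNamespace stands in for the Product class from the question.
list_of_products = [
    SimpleNamespace(name="Apple, Red", ean="12345", art_nr="A001", price=1.00),
    SimpleNamespace(name="Cherry, Red", ean="66550", art_nr="A345", price=1.45),
    SimpleNamespace(name="Banana", ean="12359", art_nr="A002", price=0.50),
]

print([p.name for p in fuzzy_search(list_of_products, "chery")])  # misspelled query
```

`SequenceMatcher` is written in pure Python, so scoring every word of every product on each keystroke may be too slow for a responsive GUI. A dedicated fuzzy-matching package could be swapped in behind the same two functions.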

Surely someone has had this problem before and can toss me a simple keyword or even a package.
Or show a more efficient way than looping over a few thousand items.
Or suggest how to implement a fuzzy search or any other useful feature.

Would appreciate some suggestions. Thx and happy new year!

