I have a list of sorted files:
files = [file_1, file_2, file_3, file_4, file_5, file_6, file_7, file_8, file_9, file_10]
I need to determine which of these files contain a given string. The string usually occurs in a contiguous range of files, for example from file_1 to file_4, or from file_3 to file_10.
At the moment, I loop through this list, open each file in turn, and check whether it contains the string. But this is slow, because there are thousands of strings to search for and more than 50 files to search.
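For reference, the current approach is roughly the following sketch. The helper names `file_contains` and `files_containing` are hypothetical, and it assumes plain-text files and a search string that does not span multiple lines.

```python
def file_contains(path, needle):
    """Return True if any line of the file at `path` contains `needle`.

    Assumption: the file is text and `needle` never spans a line break.
    """
    with open(path, encoding="utf-8", errors="replace") as fh:
        return any(needle in line for line in fh)


def files_containing(paths, needle):
    """Open every file in turn and collect the ones that contain `needle`."""
    return [p for p in paths if file_contains(p, needle)]
```

Every search string costs one pass over every file, which is why the total time grows with (number of strings) × (number of files).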
So my idea is to find the range with a binary search instead of checking the files one by one.
As I see it:
The first and last files in the list are opened first. If the string is present in both of them, then the range for this string is the whole list of files. If the string is present in the first file but not in the last one, then the first file is the beginning of the range, and the next candidate for the end is the file in the middle of the list. And so on.
I've tried to implement it along the lines of the well-known binary search algorithm, but realized that the two implementations do not match at all, and I haven't managed to write such an algorithm myself.
Can anyone help me? At least with pseudocode.
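One possible sketch of the idea described above, in Python. It is not a definitive implementation and rests on two assumptions: the files containing the string form a single contiguous run, and `present(i)` is a hypothetical callback that opens file number `i` and reports whether the string is in it. The function name `find_range` is also made up for illustration. Note that when the string is in neither the first nor the last file, a binary search has nothing to steer by, so the sketch falls back to a linear scan to find the start of the run.

```python
def find_range(n, present):
    """Return (first, last) indices of the run where present(i) is True, or None.

    Assumption: the True results form one contiguous run of indices.
    """
    seen = {}

    def check(i):
        # Cache results so that no file is opened twice.
        if i not in seen:
            seen[i] = present(i)
        return seen[i]

    if n == 0:
        return None
    if check(0) and check(n - 1):
        return 0, n - 1  # present at both ends: the whole list
    if check(0):
        first = 0
    elif check(n - 1):
        # The run ends at the last file; binary search for its start.
        lo, hi = 0, n - 1  # invariant: check(lo) is False, check(hi) is True
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if check(mid):
                hi = mid
            else:
                lo = mid
        return hi, n - 1
    else:
        # Present at neither end: scan left to right for the start of the run.
        first = next((i for i in range(1, n - 1) if check(i)), None)
        if first is None:
            return None

    # Binary search for the end of the run.
    lo, hi = first, n  # invariant: check(lo) is True; hi is n or check(hi) is False
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if check(mid):
            lo = mid
        else:
            hi = mid
    return first, lo


# Example with 10 files where the string occurs in file_3 .. file_7.
print(find_range(10, lambda i: 2 <= i <= 6))  # → (2, 6)
```

With a real list this would be called as `find_range(len(files), lambda i: contains(files[i], s))`, where `contains` stands for whatever per-file check is already in use. When the string is in the first or last file, this opens about log2(n) files instead of n. Otherwise the number of files opened depends on where the run starts.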