I have a .txt file that I want to read into a pandas DataFrame, such as this:

    #Date                run      angle     NAME
    #-----               _______  ________  _______
    2023-02-15 10:00:00  120716   -1.75493  4.5x10-4 Al 40um
    2023-02-15 10:38:48  120716   -1.75493  JD70-103 50um 0/90 deg
    2023-02-15 18:25:41  120723   -0.658    JD70-103 50um 45/135 deg

I am trying to read the .txt file using a regex separator that matches any run of whitespace, unless it is followed by a time ("\d\d:\d\d:\d\d"), "Al", "\d\dum", "deg", or something that looks like "[0-999]/[0-999]":

    import pandas as pd

    df = pd.read_csv("file.txt", engine='python', sep=r'\s+(?!\d\d:\d\d:\d\d|Al|\d\dum|deg|((\d|\d\d|\d\d\d)\/(\d|\d\d|\d\d\d)))')

For some reason, this is creating a dataframe that inserts three columns of NaN values in between each of my desired columns:

                          Date  NaN  None.1  None.2     run  None.3  None.4  None.5     angle  ...
    0  2023-02-15 10:00:00  NaN     NaN     NaN  120716     NaN     NaN     NaN  -1.75493  ...
    1  2023-02-15 10:38:48  NaN     NaN     NaN  120716     NaN     NaN     NaN  -1.75493  ...
    2  2023-02-15 18:25:41  NaN     NaN     NaN  120723     NaN     NaN     NaN    -0.658  ...

Any idea why this is happening? My best guess is that the separator is splitting on each whitespace character in a run separately, but that is not the behavior I would expect from the regex above.
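
One way to test that guess outside pandas is to apply the same pattern with `re.split` to a single row. This is only a minimal sketch: it assumes the python engine splits each line in roughly the way `re.split` does, and that the tokens in the row are separated by single spaces. The name `SEP_NON_CAPTURING` and its pattern are hypothetical, introduced here for comparison; it keeps the same alternatives but drops the three capturing groups.

```python
import re

# Separator from the question: note the three capturing groups
# inside the negative lookahead.
SEP_CAPTURING = r'\s+(?!\d\d:\d\d:\d\d|Al|\d\dum|deg|((\d|\d\d|\d\d\d)\/(\d|\d\d|\d\d\d)))'

# Hypothetical variant for comparison: same alternatives, but
# \d{1,3} replaces the (\d|\d\d|\d\d\d) groups so nothing is captured.
SEP_NON_CAPTURING = r'\s+(?!\d\d:\d\d:\d\d|Al|\d\dum|deg|\d{1,3}/\d{1,3})'

# First data row of the sample, assuming single spaces between tokens.
row = "2023-02-15 10:00:00 120716 -1.75493 4.5x10-4 Al 40um"

# re.split inserts the text of every capturing group into the result;
# a group that did not take part in the match shows up as None.
with_groups = re.split(SEP_CAPTURING, row)
without_groups = re.split(SEP_NON_CAPTURING, row)

print(with_groups)
print(without_groups)
```

If the first list shows three `None` entries between each pair of fields while the second does not, that would point to the capturing groups in the separator, rather than repeated whitespace, as the source of the extra columns.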