I have text containing assorted transactions that I'm trying to parse using regex.
The text looks like this:
JT Meta Platforms, Inc. - Class ACommon Stock (META) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000F S: NewS O: Morgan Stanley - Select UMA Account # 1JT Microsoft Corporation - CommonStock (MSFT) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000F S: NewS O: Morgan Stanley - Select UMA Account # 1JT Microsoft Corporation - CommonStock (MSFT) [OP]P 02/13/2024 03/05/2024 $500,001 -$1,000,000F S: NewS O: Morgan Stanley - Portfolio Management Active Assets AccountD: Call options; Strike price $170; Expires 01/17 /2025C: Ref: 044Q34N6

I've created a regex to parse out individual transactions, each denoted by the combination of ticker (e.g., (MSFT)), type (e.g., [ST], [OP]) and amount (e.g., $500,000, etc.), as follows:
transactions = rx.findall(r"\([A-Z][^$]*\$[^$]*\$[,\d]+", text)

Transactions are returned as a list and look like this, for example:
(META) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000

I'd like to add logic to include the description details (i.e., 'D:...') if they exist. I tried the pattern shown at the end of this post, but it winds up returning just one large transaction, since the first two transactions don't have description details (i.e., 'D:').
I'd like to see this:
(META) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000..
(MSFT) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000..
(MSFT) [OP]P 02/13/2024 03/05/2024 $500,001 -$1,000,000F S: NewS O: Morgan Stanley - Portfolio Management Active Assets AccountD: Call options; Strike price $170; Expires 01/17 /2025

What am I doing wrong?
rx.findall(r"\([A-Z][^$]*\$[^$]*\$[,\d]+[\s\S]*?D:(.*)", text)
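For what it's worth, two things seem to be going on in that attempt. First, `[\s\S]*?D:` is allowed to run across the following transactions until it reaches the first 'D:' anywhere in the text, so the match that starts at (META) swallows everything up to the third transaction's description. Second, once a pattern contains a capture group, `findall` returns only the group, not the whole match. Below is a minimal sketch of one possible way around both, not a definitive fix. It assumes the standard `re` module (the `rx` alias in the post isn't shown), that a new transaction always starts with `(` followed by a capital letter, and that a description runs until a 'C:' marker or the end of the text. The names `BASE`, `TAIL` and `pattern` are made up for illustration.

```python
import re

# Sample text copied from the post, split across literals only for readability.
text = (
    "JT Meta Platforms, Inc. - Class ACommon Stock (META) [ST]S (partial) "
    "02/08/2024 03/05/2024 $1,001 - $15,000F S: NewS O: Morgan Stanley - "
    "Select UMA Account # 1JT Microsoft Corporation - CommonStock (MSFT) "
    "[ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000F S: NewS O: "
    "Morgan Stanley - Select UMA Account # 1JT Microsoft Corporation - "
    "CommonStock (MSFT) [OP]P 02/13/2024 03/05/2024 $500,001 -$1,000,000F "
    "S: NewS O: Morgan Stanley - Portfolio Management Active Assets "
    "AccountD: Call options; Strike price $170; Expires 01/17 /2025"
    "C: Ref: 044Q34N6"
)

# The original pattern from the post: ticker, type, and amount range.
BASE = r"\([A-Z][^$]*\$[^$]*\$[,\d]+"

# Optional tail (assumption-laden):
#   (?:(?!\([A-Z]).)*?D:      walk forward to "D:" without stepping over the
#                             start of the next transaction, "(" + capital
#   (?:(?!\([A-Z]|C:).)*      then take the description up to "C:", the next
#                             transaction, or the end of the text
# The whole tail is optional and non-capturing, so findall still returns
# the full match for every transaction.
TAIL = r"(?:(?:(?!\([A-Z]).)*?D:(?:(?!\([A-Z]|C:).)*)?"

pattern = re.compile(BASE + TAIL, re.DOTALL)
transactions = pattern.findall(text)

for t in transactions:
    print(t)
```

On the sample text this yields three items: the first two stop at the amount, and the third continues through 'Expires 01/17 /2025'. One caveat: if a description were followed directly by another transaction with no 'C:' in between, the description would also pick up the next security's name up to its ticker, so the stop condition may need tightening for real data.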