Channel: Active questions tagged python - Stack Overflow

Parse string with specific characters if they exist using regex

I have text containing assorted transactions that I'm trying to parse using regex.

The text looks like this:

JT Meta Platforms, Inc. - Class ACommon Stock (META) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000F S: NewS O: Morgan Stanley - Select UMA Account # 1JT Microsoft Corporation - CommonStock (MSFT) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000F S: NewS O: Morgan Stanley - Select UMA Account # 1JT Microsoft Corporation - CommonStock (MSFT) [OP]P 02/13/2024 03/05/2024 $500,001 -$1,000,000F S: NewS O: Morgan Stanley - Portfolio Management Active Assets AccountD: Call options; Strike price $170; Expires 01/17 /2025C: Ref: 044Q34N6

I've created a regex to parse out individual transactions, each identified by the combination of a ticker (e.g., (MSFT)), a type (e.g., [ST] or [OP]) and an amount range (e.g., $500,001 -$1,000,000), as follows:

transactions = rx.findall(r"\([A-Z][^$]*\$[^$]*\$[,\d]+", text)

The transactions are returned as a list. For example, the first one looks like this:

(META) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000

I'd like to add logic to include the description details (i.e., 'D:...') when they exist. I tried the pattern shown at the end of this question, but it returns just one large match instead of three, because the first two transactions have no description details (i.e., no 'D:') and the pattern keeps scanning until it reaches the 'D:' of the third transaction.

I'd like to see this:

(META) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000

..

(MSFT) [ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000

..

(MSFT) [OP]P 02/13/2024 03/05/2024 $500,001 -$1,000,000F S: NewS O: Morgan Stanley - Portfolio Management Active Assets AccountD: Call options; Strike price $170; Expires 01/17 /2025

What am I doing wrong?

rx.findall(r"\([A-Z][^$]*\$[^$]*\$[,\d]+[\s\S]*?D:(.*)", text)
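
One possible direction, offered as a sketch rather than a definitive fix: in the attempt above, 'D:' is mandatory, so `[\s\S]*?` has to keep expanding from the first transaction until it reaches the only 'D:' in the text, and because the pattern contains a capturing group, `findall` returns only that group. The sketch below makes the description part optional and stops the scan for 'D:' at the next `(TICKER`. It assumes `rx` behaves like the standard `re` module, that a description ends at 'C:' (or at the next ticker, or at the end of the text), and that 'D:' and 'C:' never occur inside other fields. The names `BASE`, `TAIL` and `pattern` are made up for illustration.

```python
import re

# Sample text copied from the question (a single long line).
text = (
    "JT Meta Platforms, Inc. - Class ACommon Stock (META) [ST]S (partial) "
    "02/08/2024 03/05/2024 $1,001 - $15,000F S: NewS O: Morgan Stanley - "
    "Select UMA Account # 1JT Microsoft Corporation - CommonStock (MSFT) "
    "[ST]S (partial) 02/08/2024 03/05/2024 $1,001 - $15,000F S: NewS O: "
    "Morgan Stanley - Select UMA Account # 1JT Microsoft Corporation - "
    "CommonStock (MSFT) [OP]P 02/13/2024 03/05/2024 $500,001 -$1,000,000F "
    "S: NewS O: Morgan Stanley - Portfolio Management Active Assets "
    "AccountD: Call options; Strike price $170; Expires 01/17 /2025C: Ref: "
    "044Q34N6"
)

# The original pattern from the question: ticker through the amount range.
BASE = r"\([A-Z][^$]*\$[^$]*\$[,\d]+"

# Optional tail (assumption: descriptions start at "D:" and end at "C:").
#   (?:(?!\([A-Z]).)*?   scan forward lazily, never crossing the next "(TICKER"
#   D:.*?                the description itself, matched lazily
#   (?=C:|\([A-Z]|$)     stop before "C:", the next ticker, or end of text
# The whole tail is wrapped in (?: ... )? so transactions without "D:" still match.
TAIL = r"(?:(?:(?!\([A-Z]).)*?D:.*?(?=C:|\([A-Z]|$))?"

# No capturing groups are used, so findall returns the full match each time.
pattern = re.compile(BASE + TAIL, re.DOTALL)
transactions = pattern.findall(text)

for t in transactions:
    print(t)
```

On the sample text this yields three items: the first two end at `$15,000`, and the third runs through `Expires 01/17 /2025`. If real data contains 'D:' or 'C:' inside account names or descriptions, the boundaries would need tightening.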
