I have a column of phone numbers. They are usually formatted as (555) 123-4567, but sometimes they are in a different format or are not valid numbers at all. I am trying to convert this field to contain only the digits, removing any non-numeric characters, but only when the value contains exactly 10 digits.
How can I apply a function that says: if there are 10 digits in this field, extract just the digits?
I tried to use:
df['PHONE'] = df['PHONE'].str.extract(r'(\d+)', expand=False)
But this extracts only the first run of digits (the area code). How do I pull all the digits, and only run this extraction if there are exactly 10 digits in the field?
For the example above, my expected output would be 5551234567
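
To make the goal concrete, here is a minimal sketch of the behaviour I am after, written as a plain Python function. The name digits_if_ten and the choice to return anything else unchanged are my own assumptions, and this may well not be the most idiomatic pandas approach.

```python
import re


def digits_if_ten(value):
    """Return only the digits of value if it contains exactly 10 of them.

    Hypothetical helper: any other value (wrong digit count, missing
    value, non-string) is returned unchanged.
    """
    # Missing values such as NaN/None are not strings; leave them alone.
    if not isinstance(value, str):
        return value
    # Drop every character that is not a digit.
    digits = re.sub(r"\D", "", value)
    # Only replace the value when exactly 10 digits remain.
    return digits if len(digits) == 10 else value
```

It could presumably be applied to the column with df['PHONE'] = df['PHONE'].apply(digits_if_ten), so that (555) 123-4567 becomes 5551234567 while a value like 123-4567 is left as it is.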