Channel: Active questions tagged python - Stack Overflow

Why does Python ignore the sign in the column name?


I have a dataframe of text that looks like this:

RepID, Txt
1, +83 -193 -380 +55 +901
2, -94 +44 +2892 -60
3, +7010 -3840 +3993

Although the Txt field contains tokens such as +282 and -829, these are string values, not numeric ones.

The problem appears when I apply my bag-of-words function:

    import numpy as np
    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer

    def BOW(df):
        CountVec = CountVectorizer()  # to use only bigrams: ngram_range=(2, 2)
        Count_data = CountVec.fit_transform(df)
        Count_data = Count_data.astype(np.uint8)
        cv_dataframe = pd.DataFrame(Count_data.toarray(), columns=CountVec.get_feature_names_out(), index=df.index)  # <- HERE
        return cv_dataframe.astype(np.uint8)

The resulting column names have lost their + and - signs.

The outcome is:

RepID  83  193  380  55  ...
1      1   1    1    1
2      0   0    0    0

It should be:

RepID  +83  -193  -380  +55  ...
1      1    1     1     1
2      0    0     0     0

Why does this happen, and how can I fix it?
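A likely cause, offered as a sketch rather than a confirmed diagnosis: CountVectorizer splits text with a regular expression, and its documented default token_pattern, r"(?u)\b\w\w+\b", only matches runs of two or more word characters, so + and - are treated as separators. The snippet below mimics that tokenization with the standard re module only. The names tokenize and SIGNED_PATTERN are hypothetical, and the signed pattern is just one assumed way to keep a leading sign.

```python
import re

# Default token_pattern documented for scikit-learn's CountVectorizer.
DEFAULT_PATTERN = r"(?u)\b\w\w+\b"
# Assumed alternative: an optional leading sign followed by word characters.
SIGNED_PATTERN = r"(?u)[+-]?\w+"


def tokenize(text, pattern):
    """Mimic regex-based tokenization: lowercase, then return every match."""
    return re.findall(pattern, text.lower())


sample = "+83 -193 -380 +55"
print(tokenize(sample, DEFAULT_PATTERN))  # signs are dropped
print(tokenize(sample, SIGNED_PATTERN))   # signs are kept
```

If this is indeed the cause, passing the same pattern to the vectorizer, for example CountVectorizer(token_pattern=r"(?u)[+-]?\w+"), should keep the signs in get_feature_names_out(). Note that this pattern also accepts single-character tokens, which the default pattern skips.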

