python - Dropping infinite values from dataframes in pandas?

ID : 10002

viewed : 40

Tags : python, pandas, numpy

Top 5 Answers for python - Dropping infinite values from dataframes in pandas?

votes : 99

The simplest way is to first replace() the infs with NaN:

df.replace([np.inf, -np.inf], np.nan, inplace=True) 

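and then use dropna(). When chaining the two calls, leave out inplace=True, because replace() returns None when it modifies the frame in place:

df.replace([np.inf, -np.inf], np.nan) \
    .dropna(subset=["col1", "col2"], how="all")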

For example:

In [11]: df = pd.DataFrame([1, 2, np.inf, -np.inf])

In [12]: df.replace([np.inf, -np.inf], np.nan)
Out[12]:
    0
0   1
1   2
2 NaN
3 NaN

The same method would work for a Series.
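As a minimal sketch of the replace-then-dropna pattern described above, the following uses a small made-up frame. The column names col1 and col2 come from the answer; the data and the extra column "other" are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only the names "col1"/"col2" come from the answer.
df = pd.DataFrame({
    "col1": [1.0, np.inf, -np.inf, 4.0],
    "col2": [1.0, 2.0, np.inf, np.nan],
    "other": [10, 20, 30, 40],
})

# replace() returns a new frame here (no inplace=True), so dropna() can be chained.
cleaned = (
    df.replace([np.inf, -np.inf], np.nan)
      .dropna(subset=["col1", "col2"], how="all")
)

# The same pattern applied to a Series.
s = pd.Series([1.0, np.inf, -np.inf, 2.0])
s_clean = s.replace([np.inf, -np.inf], np.nan).dropna()
```

With how="all", a row is dropped only when both col1 and col2 end up as NaN, so in this sketch only the row at index 2 is removed. The original df is left untouched.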

votes : 82

With option context, this is possible without permanently setting use_inf_as_na. For example:

with pd.option_context('mode.use_inf_as_na', True):
    df = df.dropna(subset=['col1', 'col2'], how='all')

Of course it can be set to treat inf as NaN permanently with

pd.set_option('use_inf_as_na', True) 

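For older versions, replace use_inf_as_na with use_inf_as_null. Note that in pandas 2.1 and later the use_inf_as_na option is deprecated, and converting inf to NaN explicitly beforehand is recommended instead.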
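For code that should not depend on the option at all, roughly the same row filter can be sketched with a boolean mask. This is an illustrative alternative, not the answer's own method. The frame below is made up, and the variable names missing and result are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; only the column names come from the answer.
df = pd.DataFrame({
    "col1": [1.0, np.inf, np.nan, -np.inf],
    "col2": [np.inf, np.inf, 3.0, np.nan],
})

cols = ["col1", "col2"]
# True where a value is NaN or +/-inf, i.e. what the option would treat as missing.
missing = df[cols].isna() | np.isinf(df[cols])
# Counterpart of dropna(subset=cols, how="all"): drop rows where every listed column is missing.
result = df[~missing.all(axis=1)]
```

As with the option-based approach, the values themselves are not modified: an inf in a row that survives the filter stays an inf.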

votes : 78

Use (fast and simple):

df = df[np.isfinite(df).all(1)] 

This answer is based on DougR's answer to another question. Here is some example code:

import pandas as pd
import numpy as np

df = pd.DataFrame([1, 2, 3, np.nan, 4, np.inf, 5, -np.inf, 6])
print('Input:\n', df, sep='')

# Keep only rows in which every value is finite (drops NaN, inf and -inf)
df = df[np.isfinite(df).all(1)]
print('\nDropped:\n', df, sep='')

Result:

Input:
        0
0  1.0000
1  2.0000
2  3.0000
3     NaN
4  4.0000
5     inf
6  5.0000
7    -inf
8  6.0000

Dropped:
     0
0  1.0
1  2.0
2  3.0
4  4.0
6  5.0
8  6.0
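One caveat worth illustrating: np.isfinite is generally only defined for numeric data, so a frame with string columns needs the check restricted to its numeric part. The sketch below assumes a made-up mixed-type frame. The column names x, y and label are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type frame; column names are made up for illustration.
df = pd.DataFrame({
    "x": [1.0, np.inf, 3.0, np.nan, 5.0],
    "y": [1.0, 2.0, -np.inf, 4.0, 6.0],
    "label": ["a", "b", "c", "d", "e"],
})

# Apply the finiteness check to the numeric columns only.
numeric = df.select_dtypes(include="number")
# Keep rows whose numeric values are all finite (this also drops NaN rows).
finite_rows = np.isfinite(numeric).all(axis=1)
result = df[finite_rows]
```

The boolean mask is computed on the numeric columns but applied to the full frame, so the label column is carried along unchanged.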
votes : 62

Here is another method using .loc to replace inf with nan on a Series:

s.loc[(~np.isfinite(s)) & s.notnull()] = np.nan 
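To see what the mask in this one-liner selects, here is a small sketch on a made-up Series. The intermediate name is_inf is an assumption added for readability.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.inf, np.nan, -np.inf, 5.0])

# Entries that are not finite but also not already NaN, i.e. only +/-inf.
is_inf = (~np.isfinite(s)) & s.notnull()
# Overwrite just those entries in place.
s.loc[is_inf] = np.nan
```

The s.notnull() term only keeps existing NaN values out of the mask. Since they would be overwritten with NaN anyway, the result is the same with or without it, but the mask then identifies exactly the infinite entries.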

So, in response to the original question:

df = pd.DataFrame(np.ones((3, 3)), columns=list('ABC'))

for i in range(3):
    df.iat[i, i] = np.inf

df
          A         B         C
0       inf  1.000000  1.000000
1  1.000000       inf  1.000000
2  1.000000  1.000000       inf

df.sum()
A    inf
B    inf
C    inf
dtype: float64

df.apply(lambda s: s[np.isfinite(s)].dropna()).sum()
A    2
B    2
C    2
dtype: float64
votes : 52

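The replace()-based solution above also modifies infs that are not in the target columns. To restrict the replacement to specific columns, pass a dict that maps each column name to the values to replace: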

lst = [np.inf, -np.inf]
to_replace = {v: lst for v in ['col1', 'col2']}
df.replace(to_replace, np.nan)
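A small sketch of the column-restricted replacement, assuming a made-up frame in which col3 is a hypothetical non-target column that should keep its inf:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "col1"/"col2" are the target columns, "col3" is not.
df = pd.DataFrame({
    "col1": [1.0, np.inf],
    "col2": [-np.inf, 2.0],
    "col3": [np.inf, 3.0],
})

lst = [np.inf, -np.inf]
# Map each target column to the list of values to replace in that column.
to_replace = {v: lst for v in ["col1", "col2"]}
result = df.replace(to_replace, np.nan)
```

Only col1 and col2 have their infinite values turned into NaN. A follow-up dropna(subset=["col1", "col2"], how="all") would then behave as in the first answer.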
