GroupBy.
diff
First discrete difference of element.
Calculates the difference of a DataFrame element compared with another element in the DataFrame group (default is the element in the same column of the previous row).
Periods to shift for calculating difference, accepts negative values.
See also
pyspark.pandas.Series.groupby
pyspark.pandas.DataFrame.groupby
Examples
>>> df = ps.DataFrame({'a': [1, 2, 3, 4, 5, 6], ... 'b': [1, 1, 2, 3, 5, 8], ... 'c': [1, 4, 9, 16, 25, 36]}, columns=['a', 'b', 'c']) >>> df a b c 0 1 1 1 1 2 1 4 2 3 2 9 3 4 3 16 4 5 5 25 5 6 8 36
>>> df.groupby(['b']).diff().sort_index() a c 0 NaN NaN 1 1.0 3.0 2 NaN NaN 3 NaN NaN 4 NaN NaN 5 NaN NaN
Difference with previous column in a group.
>>> df.groupby(['b'])['a'].diff().sort_index() 0 NaN 1 1.0 2 NaN 3 NaN 4 NaN 5 NaN Name: a, dtype: float64