The groupby method in pandas is one of the most powerful tools for aggregating and summarizing data. It splits the data into groups according to some criteria, performs a computation on each group, and combines the results into a new Series or DataFrame.

Key Steps of groupby

  1. Split: The data is divided into groups based on one or more keys (column values).
  2. Apply: A function is applied to each group independently.
  3. Combine: The results are combined into a single data structure.
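
The three steps above can be sketched by hand and compared with the one-line groupby call. This is only an illustration: the toy DataFrame and its column names (color, price) are assumptions and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "color": ["red", "blue", "red", "blue", "red"],
    "price": [10, 20, 30, 40, 80],
})

# Split: iterating over a groupby object yields (key, sub-DataFrame) pairs.
pieces = {key: group for key, group in df.groupby("color")}

# Apply: compute the mean price of each piece independently.
means = {key: group["price"].mean() for key, group in pieces.items()}

# Combine: assemble the per-group results into a single Series.
manual = pd.Series(means, name="price").rename_axis("color")

# The same three steps expressed as one groupby call.
auto = df.groupby("color")["price"].mean()

print(manual)
print(auto)
```

Both Series should hold the same per-color means; in practice the single groupby call is preferred, and the manual version is shown only to make the steps visible.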

Syntax

df.groupby(by, axis=0, level=None, as_index=True)
 

  • by: The column(s) or index level(s) to group by.
  • axis: Defaults to 0 (rows). axis=1 groups by columns, but it is deprecated since pandas 2.1; group the transposed frame (df.T.groupby(...)) instead.
  • level: For MultiIndex, group by a specific level.
  • as_index: If True (the default), the group labels become the index of the result. Set to False to return the group labels as regular columns with a default integer index (SQL-style output).
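
A minimal sketch of how the as_index and level parameters change the result. The DataFrame below is hypothetical; the color, model, and price columns and their values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy data; names and values are illustrative only.
df = pd.DataFrame({
    "color": ["red", "blue", "red", "blue"],
    "model": ["SE", "SE", "SEL", "SE"],
    "price": [10, 20, 30, 40],
})

# as_index=True (default): the group labels form the index of the result.
indexed = df.groupby("color")["price"].sum()

# as_index=False: the group labels stay as an ordinary column,
# and the result gets a fresh 0..n-1 integer index.
flat = df.groupby("color", as_index=False)["price"].sum()

# level: group by a named level of a MultiIndex instead of a column.
multi = df.set_index(["color", "model"])
by_level = multi.groupby(level="model")["price"].sum()

print(indexed)
print(flat)
print(by_level)
```

The as_index=False form is often convenient when the result is going to be merged or exported, because the keys are already regular columns.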

 

Example

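import pandas as pd

# Load the used-cars dataset (the path is relative to the working directory).
df = pd.read_csv('../DataSets/usedcars.csv')

# Summary statistics (count, mean, std, quartiles, ...) for each (color, model) group.
df.groupby(by=['color', 'model']).describe()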
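
For readers without the CSV file, the shape of the describe() result can be explored on a small stand-in. The cars DataFrame below is hypothetical: its columns and values are assumptions and are not taken from usedcars.csv.

```python
import pandas as pd

# Hypothetical stand-in for the used-cars data; values are made up.
cars = pd.DataFrame({
    "color": ["red", "red", "blue", "blue", "blue"],
    "model": ["SE", "SE", "SE", "SEL", "SEL"],
    "price": [10000, 12000, 9000, 15000, 17000],
})

# One row per (color, model) pair; the columns are (column name, statistic) pairs.
summary = cars.groupby(by=["color", "model"]).describe()

# Pick a single statistic for a single group out of the result.
red_se_mean = summary.loc[("red", "SE"), ("price", "mean")]

# When only a few statistics are needed, agg gives a narrower table.
narrow = cars.groupby(["color", "model"])["price"].agg(["count", "mean"])

print(summary)
print(red_se_mean)
print(narrow)
```

Because grouping by two keys produces a MultiIndex on the rows, individual groups are selected with a tuple such as ("red", "SE").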