Sort values in each row of polars df

If your data isn’t too large, here’s an easy solution. It loops over the rows in Python, so it will be slow on big frames. Let’s say you have this data:

import polars as pl

df = pl.DataFrame({
    "A": [3, 4, 2],
    "B": [1, 5, 0],
    "C": [0, 1, 3]}
)

This function sorts the values within each row, across the chosen columns only. The other columns are left untouched.

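def sort(df, cols):
    orig_cols = df.columns
    # Sort the selected values of each row in plain Python.
    rows = [sorted(r) for r in df.select(cols).rows()]
    # Transpose the row-wise lists into column-wise lists.
    columns = list(map(list, zip(*rows)))
    sorted_df = pl.DataFrame(
        columns,
        schema={k: v for k, v in df.schema.items() if k in cols},
        orient="col",  # be explicit so polars does not have to guess
    )
    # Re-attach the untouched columns and restore the original column order.
    other_df = df.select(pl.col("*").exclude(cols))
    df = pl.concat((sorted_df, other_df), how="horizontal")
    return df.select(orig_cols)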

As an example, if we sort across ["A", "C"]:

print(sort(df, ["A", "C"]))
shape: (3, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 0   ┆ 1   ┆ 3   │
│ 1   ┆ 5   ┆ 4   │
│ 2   ┆ 0   ┆ 3   │
└─────┴─────┴─────┘