If your data isn’t too large (the function below loops over the rows in Python), here’s an easy solution. Let’s say you have this data:
import polars as pl

df = pl.DataFrame({
    "A": [3, 4, 2],
    "B": [1, 5, 0],
    "C": [0, 1, 3],
})
Use this function to sort the values within each row, across the chosen columns only.
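def sort(df, cols):
    orig_cols = df.columns
    # Sort the values of the chosen columns within each row.
    rows = []
    for r in df.select(cols).rows():
        rows.append(sorted(r))
    # Transpose: turn the list of sorted rows into a list of columns.
    rows = list(map(list, zip(*rows)))
    # Rebuild the chosen columns, keeping their original dtypes.
    sorted_df = pl.DataFrame(
        rows,
        schema={k: v for k, v in df.schema.items() if k in cols},
        orient="col",  # the data is column-oriented after the transpose
    )
    # Columns that were not sorted are carried over unchanged.
    other_df = df.select(pl.col("*").exclude(cols))
    df = pl.concat((sorted_df, other_df), how="horizontal")
    # Restore the original column order.
    return df.select(orig_cols)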
As an example, sorting across ["A", "C"] gives:
print(sort(df, ["A", "C"]))
shape: (3, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 0   ┆ 1   ┆ 3   │
│ 1   ┆ 5   ┆ 4   │
│ 2   ┆ 0   ┆ 3   │
└─────┴─────┴─────┘
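
To see why the transpose step is needed, here is a minimal pure-Python sketch of the core of the function, without Polars. The helper name `sort_rows_then_transpose` is hypothetical (it is not a Polars function), and the input is just the rows of columns "A" and "C" from the example above.

```python
def sort_rows_then_transpose(rows):
    """Sort each row, then flip the row-major result into column-major lists."""
    sorted_rows = [sorted(r) for r in rows]
    # zip(*...) groups the i-th element of every row, i.e. it builds column i.
    return [list(col) for col in zip(*sorted_rows)]


# Rows of columns "A" and "C" from the example frame: A = [3, 4, 2], C = [0, 1, 3].
rows_ac = [(3, 0), (4, 1), (2, 3)]
columns = sort_rows_then_transpose(rows_ac)
print(columns)  # [[0, 1, 2], [3, 4, 3]]
```

The first inner list becomes the new "A" column and the second the new "C" column, which matches the printed table.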