Beyond Knowledge Innovation

How-to: give a specific sorting order to categorical values

February 7, 2024 | CEO | 179 views

In pandas, you can give categorical values a specific sorting order by converting the column to an ordered categorical with an explicit list of categories. Here’s an example:

import pandas as pd

# Sample DataFrame
data = {'day': ['Monday', 'Wednesday', 'Friday', 'Tuesday', 'Thursday']}
df = pd.DataFrame(data)

# Define the custom order for sorting
custom_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

# Convert 'day' column to a categorical variable with custom order
df['day'] = pd.Categorical(df['day'], categories=custom_order, ordered=True)

# Sort the DataFrame based on the custom order
df = df.sort_values('day')

# Display the sorted DataFrame
print(df)

Output:

         day
0     Monday
3    Tuesday
1  Wednesday
4   Thursday
2     Friday

In this example:

  1. We create a DataFrame with a ‘day’ column containing days of the week in an unsorted order.
  2. We define a custom order for sorting (‘custom_order’).
  3. We convert the ‘day’ column to a categorical variable using pd.Categorical with the specified custom order and setting ordered=True.
  4. We use df.sort_values('day') to sort the DataFrame based on the custom order.
  5. The resulting DataFrame will have rows sorted according to the custom order of the ‘day’ column.
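The same steps can also be written with a reusable dtype object. The following is a minimal sketch, assuming the same sample data as above; it builds a `CategoricalDtype` and applies it with `astype`, and the names `day_dtype` and `sorted_days` are illustrative only.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Same sample data and custom order as in the example above
df = pd.DataFrame({'day': ['Monday', 'Wednesday', 'Friday', 'Tuesday', 'Thursday']})
custom_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

# A reusable ordered dtype (day_dtype is an illustrative name)
day_dtype = CategoricalDtype(categories=custom_order, ordered=True)
df['day'] = df['day'].astype(day_dtype)

# Sorting now follows custom_order instead of alphabetical order
sorted_days = df.sort_values('day')['day'].tolist()
print(sorted_days)

# The column records both the categories and the fact that they are ordered
print(df['day'].cat.categories.tolist())
print(df['day'].cat.ordered)
```

Defining the dtype once can be convenient when several columns or DataFrames need the same ordering.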

This can be useful when you want operations such as sorting or plotting to follow the natural order of the days of the week rather than the alphabetical order that pandas applies to plain strings.
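To make the difference concrete, here is a small sketch that contrasts the default sort of plain strings with an ordered categorical, using the same sample days. It also shows that an ordered categorical supports comparisons and `min`/`max`; the variable names are illustrative.

```python
import pandas as pd

days = pd.Series(['Monday', 'Wednesday', 'Friday', 'Tuesday', 'Thursday'])
order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

# Plain strings are sorted alphabetically
alphabetical = days.sort_values().tolist()
print(alphabetical)  # ['Friday', 'Monday', 'Thursday', 'Tuesday', 'Wednesday']

# With an ordered categorical, the custom order drives sorting and comparisons
ordered_days = pd.Series(pd.Categorical(days, categories=order, ordered=True))

# Keep only the days that come after Tuesday in the custom order
after_tuesday = ordered_days[ordered_days > 'Tuesday'].tolist()
print(after_tuesday)  # ['Wednesday', 'Friday', 'Thursday']

# min and max also follow the custom order
first_day, last_day = ordered_days.min(), ordered_days.max()
print(first_day, last_day)  # Monday Friday
```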

Drawing a graph before adding sort order to the week_day categorical value:

import matplotlib.pyplot as plt
import seaborn as sns

# Here df is a different DataFrame from the example above: it is assumed to have
# a 'pickups' column and a 'pickup_dt' column stored as strings (object dtype).
# Convert pickup_dt to datetime format.
df['pickup_dt'] = pd.to_datetime(df['pickup_dt'], format="%d-%m-%Y %H:%M")

# Now we can extract date parts from the pickup date
df['start_year'] = df.pickup_dt.dt.year # extracting the year from the date
df['start_month'] = df.pickup_dt.dt.month_name() # extracting the month name from the date
df['start_hour'] = df.pickup_dt.dt.hour # extracting the hour from the time
df['start_day'] = df.pickup_dt.dt.day # extracting the day from the date
df['week_day'] = df.pickup_dt.dt.day_name() # extracting the day of the week from the date

# Draw a line plot of total pickups per day of the week
plt.figure(figsize=(15,7))
# errorbar=None replaces the deprecated ci=False (seaborn 0.12+)
sns.lineplot(data=df, x="week_day", y="pickups", errorbar=None, color="red", estimator='sum')
plt.ylabel('Total pickups')
plt.xlabel('Day of the week')
plt.show()

After adding sort order to the week_day categorical value:

# Calendar order for the days of the week
cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
df['week_day'] = pd.Categorical(df['week_day'], categories=cats, ordered=True)

plt.figure(figsize=(15,7))
# errorbar=None replaces the deprecated ci=False (seaborn 0.12+)
sns.lineplot(data=df, x="week_day", y="pickups", errorbar=None, color="red", estimator='sum')
plt.ylabel('Total pickups')
plt.xlabel('Day of the week')
plt.show()
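
The effect of the ordering can also be checked without drawing a plot, by aggregating the totals per day. The sketch below uses made-up data (one row per day for two weeks, with arbitrary pickup counts); only the column names `pickup_dt`, `pickups`, and `week_day` come from the example above, and everything else is an assumption for illustration.

```python
import pandas as pd

# Made-up data for illustration: 14 consecutive days starting on a Monday
df = pd.DataFrame({
    'pickup_dt': pd.date_range('2024-01-01', periods=14, freq='D'),
    'pickups': range(1, 15),
})
df['week_day'] = df['pickup_dt'].dt.day_name()

# Before: week_day holds plain strings, so the groups come out alphabetically
before = df.groupby('week_day')['pickups'].sum()
print(before.index.tolist())

# After: convert to an ordered categorical with the calendar order
cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
df['week_day'] = pd.Categorical(df['week_day'], categories=cats, ordered=True)

# observed=False keeps every category in the result, even one with no rows
after = df.groupby('week_day', observed=False)['pickups'].sum()
print(after.index.tolist())
print(after)
```

The grouped totals follow the category order, which is the same order the x-axis follows once `week_day` is an ordered categorical.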

Tags: categorical, clean, dataset, pandas, preprocessing, python

© 2025 Beyond Knowledge Innovation