In this post, I will show how to create grouped plots in Python, where each group gets its own line. For that, we need the NumPy, pandas, and Matplotlib modules.

Initial Setup

Let’s first import the necessary modules and read the target file. Here, I am using a log file that records CPU usage over time.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# r"\s+" splits on runs of whitespace; r"\s*" can also match the empty string
df = pd.read_csv("syrupy_20220616132420.ps.log", delimiter=r"\s+")
df.columns
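
To see what the whitespace-based parsing does without the real log at hand, here is a minimal sketch. The sample rows, the three column names, and the names `sample_log` and `sample_df` are made up for illustration; the actual log file may contain more columns.

```python
import io

import pandas as pd

# Hypothetical sample in a whitespace-separated layout (not the real log).
sample_log = io.StringIO(
    "PID    TIME      CPU\n"
    "15503  13:24:20  91.2\n"
    "15503  13:24:21  90.8\n"
    "14399  13:24:20   5.1\n"
)

# r"\s+" treats one or more whitespace characters as a single separator.
sample_df = pd.read_csv(sample_log, sep=r"\s+")

print(sample_df.columns.tolist())
print(sample_df.dtypes)
```

The printed dtypes are worth a look: pandas infers the type of each column, and a later filter on PID has to use values of the same type.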

Creating Groups

Now, let’s group the records by a certain column. Here, I have used the PID column for grouping. I had a lot of records for each PID, so grouping ensures that the plot has a separate line for each PID.

fig, ax = plt.subplots(figsize=(8,6))
# label=PID would raise a NameError because PID is not a defined variable
df.groupby('PID').plot(x='TIME', y='CPU', ax=ax)

Different Scales

This step is only necessary if you need two different scales for different PIDs. A few of my PIDs had values in the 90s, while a few others stayed around 5.

Since the lines in the plot looked like almost straight, parallel lines, I had to use different scales to make the slightest changes visible.

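# The PIDs are written as strings here; if pandas parsed the PID column
# as integers, list them as integers instead (e.g. 15503)
df1 = df.loc[df['PID'].isin(['15503','15505','15509'])]
df2 = df.loc[df['PID'].isin(['14399','15507','15510','15512'])]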
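
If you would rather not hard-code the PID lists, one possible alternative is to split the PIDs by their average CPU usage. This is only a sketch: the `demo` data, the `threshold` of 50, and the names `high` and `low` are assumptions for illustration, not part of my original analysis.

```python
import pandas as pd

# Hypothetical data: two busy PIDs (around 90) and one idle PID (around 5).
demo = pd.DataFrame({
    "PID": [15503, 15503, 14399, 14399, 15505, 15505],
    "CPU": [91.0, 90.5, 5.2, 4.9, 89.7, 90.1],
})

threshold = 50.0  # assumed cut-off between the two value ranges

# Average CPU per PID, then the PIDs whose average reaches the cut-off.
mean_cpu = demo.groupby("PID")["CPU"].mean()
high_pids = mean_cpu[mean_cpu >= threshold].index

high = demo[demo["PID"].isin(high_pids)]   # candidates for the left axis
low = demo[~demo["PID"].isin(high_pids)]   # candidates for the right axis

print(sorted(high["PID"].unique()))
print(sorted(low["PID"].unique()))
```

The cut-off has to be chosen by looking at the data; a value between the two clusters works when the ranges are as far apart as they were in my case.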

Adding Legends

The following code is adapted to my needs from this Stack Overflow thread.

# https://stackoverflow.com/questions/39902522/pandas-groupby-object-in-legend-on-plot

fig, ax = plt.subplots(figsize=(8,6))
# df.groupby('PID').plot(x='TIME', y='CPU', ax=ax, label=PID)

for name, group in df1.groupby('PID'):
    group.plot(x='TIME', y='CPU', ax=ax, label=name)
    
for name, group in df2.groupby('PID'):
    group.plot(x='TIME', y='CPU', secondary_y=True, ax=ax, label=name)

Now, only one label from the right axis is shown. I found a solution in another Stack Overflow thread.

Therefore, I needed to add this snippet

# source: https://stackoverflow.com/questions/60448943/secondary-y-in-pandas-histogram-plot-legend-is-gone-missing
handles,labels = [],[]
for ax in fig.axes:
    for h,l in zip(*ax.get_legend_handles_labels()):
        handles.append(h)
        labels.append(l)

I also had to hide the legend that pandas draws by default, by passing legend=False to each plot call.
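
The handle-collecting idea can be tried in isolation with plain Matplotlib. The sketch below is not my CPU plot: the data points and the names `demo_fig`, `left_ax`, and `right_ax` are made up, and `twinx()` stands in for the secondary axis that pandas creates with secondary_y=True.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

demo_fig, left_ax = plt.subplots()
right_ax = left_ax.twinx()  # second y-axis sharing the same x-axis

# Hypothetical values: one line per axis, each with its own label.
left_ax.plot([0, 1, 2], [90, 91, 92], label="15503")
right_ax.plot([0, 1, 2], [5, 6, 5], color="tab:orange", label="14399 (right)")

# Collect the legend entries from every axes object in the figure.
merged_handles, merged_labels = [], []
for axis in demo_fig.axes:
    handles_here, labels_here = axis.get_legend_handles_labels()
    merged_handles.extend(handles_here)
    merged_labels.extend(labels_here)

# Draw a single legend that contains the entries of both axes.
right_ax.legend(merged_handles, merged_labels, loc="center right")
print(merged_labels)
```

Each axes object only knows about its own lines, which is why a legend created from one of them misses the lines drawn on the other.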

So, the entire code now looks like this

fig, ax = plt.subplots(figsize=(8,6))
# df.groupby('PID').plot(x='TIME', y='CPU', ax=ax, label=PID)

for name, group in df1.groupby('PID'):
    group.plot(x='TIME', y='CPU', ax=ax, label=name, legend=False)
    
for name, group in df2.groupby('PID'):
    group.plot(x='TIME', y='CPU', secondary_y=True, ax=ax, label=name, legend=False)

handles,labels = [],[]
for ax in fig.axes:
    for h,l in zip(*ax.get_legend_handles_labels()):
        handles.append(h)
        labels.append(l)

plt.legend(handles,labels,loc='center right')
    
plt.show()

Now, to let the reader know that the other four PIDs are plotted against the right axis, I did this

for name, group in df2.groupby('PID'):
    group.plot(x='TIME', y='CPU', secondary_y=True, ax=ax, label=str(name)+"(right)", legend=False)

Full Final Code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Define the chart font style
font = {'family' : 'Times New Roman',
        'weight' : 'bold',
        'size'   : 14}

# Apply the chart font style
plt.rc('font', **font)

# You can also define it like this
# plt.rcParams["font.family"] = "Times New Roman"

# The figure size is set in plt.subplots(figsize=...) below; calling
# figure() here as well would only create a second, empty figure.

# r"\s+" splits on runs of whitespace; r"\s*" can also match the empty string
df = pd.read_csv("syrupy_20220616132420.ps.log", delimiter=r"\s+")
df.columns

df1 = df.loc[df['PID'].isin(['15503','15505','15509'])]
df2 = df.loc[df['PID'].isin(['14399','15507','15510','15512'])]

fig, ax = plt.subplots(figsize=(8,6))
# df.groupby('PID').plot(x='TIME', y='CPU', ax=ax, label=PID)

for name, group in df1.groupby('PID'):
    group.plot(x='TIME', y='CPU', ax=ax, label=name, legend=False, rot=45)
    
for name, group in df2.groupby('PID'):
    group.plot(x='TIME', y='CPU', secondary_y=True, ax=ax, label=str(name)+"(right)", legend=False, rot=45)

plt.xticks(rotation=90)
    
# source: https://stackoverflow.com/questions/60448943/secondary-y-in-pandas-histogram-plot-legend-is-gone-missing
handles,labels = [],[]
for ax in fig.axes:
    for h,l in zip(*ax.get_legend_handles_labels()):
        handles.append(h)
        labels.append(l)

# used 'bbox_to_anchor' to put the legend outside
# used `ncol` to create horizontal legendbox
plt.legend(handles,labels,bbox_to_anchor=(.5, 1.2),loc='upper center',ncol=4)

# Show the plot
# plt.show()
    
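# To save the figure as pdf/png/jpg, use plt.savefig
# bbox_inches='tight' keeps the legend, which sits outside the axes, from being cut off
plt.savefig('CPU_log.pdf', dpi=300, bbox_inches='tight')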

That’s all for today! Cheers!!!
