How to Plot Group-plots in Python
In this post, I will show how to create group plots in Python, that is, a single chart with one line per group. For that, we need the NumPy, pandas, and Matplotlib modules.
Let’s first import the necessary modules and read the target file. Here, I am using a log file that records CPU usage over time.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# r"\s+" splits the columns on one or more whitespace characters
df = pd.read_csv("", delimiter=r"\s+")
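If you want to try the parsing step without a log file at hand, here is a minimal sketch. The sample rows are made up by me; only the column names TIME, PID and CPU come from the code in this post, and the real log may well look different.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real log file; only the column
# names (TIME, PID, CPU) are taken from the post.
sample_log = io.StringIO(
    "TIME      PID    CPU\n"
    "10:00:01  15503  91.2\n"
    "10:00:01  14399   4.8\n"
    "10:00:02  15503  92.0\n"
    "10:00:02  14399   5.1\n"
)

# Split the columns on one or more whitespace characters
sample_df = pd.read_csv(sample_log, sep=r"\s+")

# Inspect how pandas parsed each column
print(sample_df.dtypes)
print(sample_df.head())
```

In this sample pandas reads PID as integers, so it is worth checking `dtypes` on your own file before filtering on PID values later.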
Creating Group
Now, let’s group the data by a certain column. Here, I have used the PID column for grouping. I had a lot of records for each PID, so grouping ensures that the plot gets a separate line for each PID.
fig, ax = plt.subplots(figsize=(8,6))
# Draws one line per PID (label=PID would fail, since PID is not a defined variable)
df.groupby('PID').plot(x='TIME', y='CPU', ax=ax)
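To see what the grouping actually produces, here is a small sketch with made-up numbers (the `demo` frame is hypothetical, not my real log). Iterating over a `groupby` object yields one `(key, sub-frame)` pair per PID, and each of those sub-frames becomes one line in the plot.

```python
import pandas as pd

# Hypothetical data with the same column names as the log in this post
demo = pd.DataFrame({
    "TIME": [1, 1, 2, 2, 3, 3],
    "PID": [15503, 14399, 15503, 14399, 15503, 14399],
    "CPU": [91.2, 4.8, 92.0, 5.1, 90.5, 4.9],
})

# Each iteration yields the group key and the rows belonging to it
for pid, group in demo.groupby("PID"):
    print(pid, len(group), round(group["CPU"].mean(), 2))

# Number of records per PID
group_sizes = demo.groupby("PID").size()
```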
Different Scales
This step is only necessary if different PIDs need different scales. I had a few PIDs with values in the 90s and a few others with values around 5.
Plotted on a single scale, the lines looked like almost straight, parallel lines, so I had to use two different scales to make even the slightest changes visible.
df1 = df.loc[df['PID'].isin(['15503','15505','15509'])]
df2 = df.loc[df['PID'].isin(['14399','15507','15510','15512'])]
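I picked the PIDs for each group by hand. If you have many PIDs, one possible alternative is to split them by their average CPU value. The sketch below assumes made-up data and a hypothetical cut-off of 50; neither comes from my log.

```python
import pandas as pd

# Hypothetical data with the same column names as the log in this post
demo = pd.DataFrame({
    "TIME": [1, 1, 1, 2, 2, 2],
    "PID": [15503, 14399, 15507, 15503, 14399, 15507],
    "CPU": [91.2, 4.8, 5.5, 92.0, 5.1, 5.3],
})

THRESHOLD = 50.0  # hypothetical cut-off between the "high" and "low" PIDs

# Average CPU per PID, then split the PIDs around the cut-off
mean_cpu = demo.groupby("PID")["CPU"].mean()
high_pids = mean_cpu[mean_cpu >= THRESHOLD].index
low_pids = mean_cpu[mean_cpu < THRESHOLD].index

# Same isin() filtering as above, but with computed PID lists
demo_high = demo[demo["PID"].isin(high_pids)]
demo_low = demo[demo["PID"].isin(low_pids)]
```

Note that `isin` compares values as they are, so the list you pass should hold the same type (integers or strings) as the PID column.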
Add legends
The following code is adapted to my needs from a Stack Overflow thread.
fig, ax = plt.subplots(figsize=(8,6))
# df.groupby('PID').plot(x='TIME', y='CPU', ax=ax, label=PID)
for name, group in df1.groupby('PID'):
    group.plot(x='TIME', y='CPU', ax=ax, label=name)
for name, group in df2.groupby('PID'):
    group.plot(x='TIME', y='CPU', secondary_y=True, ax=ax, label=name)
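It helps to look at what `secondary_y=True` does to the figure. In the sketch below (made-up data again, and the `Agg` backend is only there so it runs without a display), pandas adds a second axes for the right-hand scale, and each axes keeps its own lines and legend entries.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one PID in the 90s, one around 5
demo = pd.DataFrame({
    "TIME": [1, 2, 3, 1, 2, 3],
    "PID": [15503, 15503, 15503, 14399, 14399, 14399],
    "CPU": [91.2, 92.0, 90.5, 4.8, 5.1, 4.9],
})

fig_demo, ax_demo = plt.subplots(figsize=(8, 6))
demo[demo["PID"] == 15503].plot(x="TIME", y="CPU", ax=ax_demo, label="15503")
demo[demo["PID"] == 14399].plot(x="TIME", y="CPU", ax=ax_demo,
                                label="14399", secondary_y=True)

# The figure now holds two axes: the original one and the right-hand twin
axes_line_counts = [len(axis.get_lines()) for axis in fig_demo.axes]
for index, axis in enumerate(fig_demo.axes):
    _, axis_labels = axis.get_legend_handles_labels()
    print(index, axis_labels)
```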
Now, only one label from the right axis is shown in the legend. I found a solution in another Stack Overflow thread.
Therefore, I needed to add this snippet, which collects the legend entries from every axis in the figure:
# source:
handles,labels = [],[]
for ax in fig.axes:
    for h,l in zip(*ax.get_legend_handles_labels()):
        handles.append(h)
        labels.append(l)
I also had to hide the legend that pandas draws by default, by passing legend=False to each plot call.
So the entire code now looks like this:
fig, ax = plt.subplots(figsize=(8,6))
# df.groupby('PID').plot(x='TIME', y='CPU', ax=ax, label=PID)
for name, group in df1.groupby('PID'):
    group.plot(x='TIME', y='CPU', ax=ax, label=name, legend=False)
for name, group in df2.groupby('PID'):
    group.plot(x='TIME', y='CPU', secondary_y=True, ax=ax, label=name, legend=False)
handles,labels = [],[]
for ax in fig.axes:
    for h,l in zip(*ax.get_legend_handles_labels()):
        handles.append(h)
        labels.append(l)
plt.legend(handles,labels,loc='center right')
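The legend-merging step can also be wrapped in a small helper. This is just a sketch: `collect_legend_entries` is a name I made up, and the demo uses plain Matplotlib with `twinx()` and invented numbers instead of my log data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt


def collect_legend_entries(figure):
    """Gather legend handles and labels from every axes in the figure."""
    handles, labels = [], []
    for axis in figure.axes:
        axis_handles, axis_labels = axis.get_legend_handles_labels()
        handles.extend(axis_handles)
        labels.extend(axis_labels)
    return handles, labels


# Two axes sharing the same x axis, one line on each
fig_demo, left_ax = plt.subplots()
right_ax = left_ax.twinx()
left_ax.plot([1, 2, 3], [91, 92, 90], label="15503")
right_ax.plot([1, 2, 3], [4.8, 5.1, 4.9], label="14399", color="tab:orange")

# One combined legend, drawn on the right-hand axes
demo_handles, demo_labels = collect_legend_entries(fig_demo)
right_ax.legend(demo_handles, demo_labels, loc="center right")
```

I draw the legend on the right-hand axes here because it sits on top of the left one, so the lines cannot cover the legend box.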
Now, to let the reader know that the other four PIDs are plotted against the right axis, I changed their labels like this:
for name, group in df2.groupby('PID'):
    group.plot(x='TIME', y='CPU', secondary_y=True, ax=ax, label=str(name)+"(right)", legend=False)
Full Final Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
%matplotlib inline
# Defines the chart font style
font = {'family' : 'Times New Roman',
        'weight' : 'bold',
        'size'   : 14}
# includes the chart font style
plt.rc('font', **font)
# You can also define like this
# plt.rcParams[""] = "Times New Roman"
# To define figure size
# Note: this opens a separate figure; the chart below is drawn on the one created by plt.subplots
figure(num=None, figsize=(12, 6))
# r"\s+" splits the columns on one or more whitespace characters
df = pd.read_csv("", delimiter=r"\s+")
df1 = df.loc[df['PID'].isin(['15503','15505','15509'])]
df2 = df.loc[df['PID'].isin(['14399','15507','15510','15512'])]
fig, ax = plt.subplots(figsize=(8,6))
# df.groupby('PID').plot(x='TIME', y='CPU', ax=ax, label=PID)
for name, group in df1.groupby('PID'):
    group.plot(x='TIME', y='CPU', ax=ax, label=name, legend=False, rot=45)
for name, group in df2.groupby('PID'):
    group.plot(x='TIME', y='CPU', secondary_y=True, ax=ax, label=str(name)+"(right)", legend=False, rot=45)
handles,labels = [],[]
for ax in fig.axes:
    for h,l in zip(*ax.get_legend_handles_labels()):
        handles.append(h)
        labels.append(l)
# used 'bbox_to_anchor' to put the legend outside
# used `ncol` to create horizontal legendbox
plt.legend(handles,labels,bbox_to_anchor=(.5, 1.2),loc='upper center',ncol=4)
# To save the figure as pdf/png/jpg, use plt.savefig
# (call it before plt.show(), which can leave nothing to save afterwards)
plt.savefig('CPU_log.pdf', dpi=300)
# Show the plot
plt.show()
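One thing to watch when saving: a legend placed outside the axes with `bbox_to_anchor` can get cut off in the saved file. A possible fix is `bbox_inches="tight"`, which asks Matplotlib to grow the saved area so it covers the legend too. The sketch below uses made-up numbers and writes to an in-memory buffer instead of a real file, purely so it is easy to run.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt

fig_demo, ax_demo = plt.subplots(figsize=(8, 6))
ax_demo.plot([1, 2, 3], [91, 92, 90], label="15503")

# Legend above the axes, same placement as in the final code
ax_demo.legend(bbox_to_anchor=(0.5, 1.2), loc="upper center", ncol=4)

# bbox_inches="tight" includes the outside legend in the saved image
buffer = io.BytesIO()
fig_demo.savefig(buffer, format="png", dpi=300, bbox_inches="tight")
png_bytes = buffer.getvalue()
```

To write a real file, pass a filename such as `'CPU_log.pdf'` instead of the buffer and drop the `format` argument.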
That’s all for today! Cheers!!!