[ENH]: API discussion for grouped bar charts #24313

Closed
@timhoffm

Problem

Currently, if one wants to draw multiple categories of bars side by side, as in https://matplotlib.org/devdocs/gallery/lines_bars_and_markers/barchart.html, one has to calculate the bar positions manually. This is a nuisance and too low-level for any user-facing interface. In fact, when asked, I recommend using the pandas plotting functions for this where possible, which is really embarrassing.

There have been stalled attempts to do this (issue #10610, PR #11048). Additionally, this often gets related to a function for stacked bars (#14086).
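For illustration, the manual position calculation mentioned above looks roughly like the following. This is only a sketch with made-up data and an arbitrarily chosen bar width; it uses nothing beyond the existing `Axes.bar` API.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (made up for this sketch).
labels = ['G1', 'G2', 'G3', 'G4', 'G5']
tea_means = [20, 34, 30, 35, 27]
coffee_means = [25, 32, 34, 20, 25]

x = np.arange(len(labels))  # one integer position per label
width = 0.35                # width of a single bar (arbitrary choice)

fig, ax = plt.subplots()
# Each group has to be shifted by hand so the bars end up side by side.
tea_bars = ax.bar(x - width / 2, tea_means, width, label='Tea')
coffee_bars = ax.bar(x + width / 2, coffee_means, width, label='Coffee')

# The tick positions and tick labels have to be set manually as well.
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()
```

With three or more groups, the offsets and the bar width both have to be recomputed, which is exactly the bookkeeping a dedicated function could take over.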

Proposed solution

I'd like to pick up this topic and come up with a reasonable API. The topic is complex, and one can easily get lost in the details. For the design procedure, I take a bottom-up approach, starting with a basic function grouped_bar() that exposes only the minimal functionality needed to get the plot done. I then intend to add further parameters one by one as they fit in. So don't be concerned that the first proposal here is quite basic.

Terminology: I'll use label for the x-values, i.e. 'G1' .. 'G5' in the above example, and group for the categories, i.e. 'Tea'/'Coffee'.


For now: Only vertical orientation

I'll limit the discussion here to 'vertical'. We can have a separate discussion on whether we want to add an orientation parameter or make a grouped_barh. Both are technically easy; the choice is purely an API design decision that's orthogonal to the rest of the API.

For now: Only grouped layout, no stacked layout

To keep things simple, I'll limit myself to grouped for now, because:

  • Stacked is somewhat simpler than grouped, as you only need to pass the bottom values in multiple bar() calls. For two groups, the bottom is just the heights of the first group; for more than two, it's a cumsum.
  • We may build a separate stacked_bar() function if the first bullet point is considered too cumbersome.
  • It's conceivable to unite both in one function, as proposed in Feature: Plot multiple bars with one call #11048 and realized in DataFrame.plot.bar, but that needs careful additional consideration.

Either way, let's defer stacked bars to later.
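For reference, the manual stacking described in the first bullet could be sketched as follows. The data and the third group ('Water') are made up for illustration; the only API used is the existing `bottom` parameter of `Axes.bar`.

```python
import matplotlib.pyplot as plt
import numpy as np

labels = ['G1', 'G2', 'G3']
heights = np.array([
    [20, 34, 30],  # Tea
    [25, 32, 34],  # Coffee
    [10, 12, 8],   # Water (hypothetical third group)
])
group_labels = ['Tea', 'Coffee', 'Water']

# bottoms[i] is the summed height of all groups drawn below group i:
# zeros for the first group, then a running sum of the previous groups.
bottoms = np.cumsum(heights, axis=0) - heights

fig, ax = plt.subplots()
for h, b, name in zip(heights, bottoms, group_labels):
    ax.bar(labels, h, bottom=b, label=name)
ax.legend()
```

Since every group shares the same x positions, no offset calculation is needed here, which is why stacking is the simpler case.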

Minimal API

We want to be able to rewrite https://matplotlib.org/devdocs/gallery/lines_bars_and_markers/barchart.html as

grouped_bar(labels, [tea_means, coffee_means], group_labels=['Tea', 'Coffee']) 

Thus, the minimal API is:

def grouped_bar(x, heights, *, group_labels=None):
    """
    Parameters
    ----------
    x : array-like of str
        The labels.
    heights : list of array-like
        An iterable of array-likes. The iteration runs over the groups.
        Each individual array-like holds the heights for that group, one per label.
    group_labels : array-like of str, optional
        The labels of the data groups.
    """
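To make the proposal concrete, here is one way the minimal API might be implemented on top of `Axes.bar`. This is only a sketch under stated assumptions, not a proposed implementation: the name `grouped_bar_sketch`, the explicit `ax` argument, and the `total_width` parameter (with its 0.8 default) are all hypothetical and not part of the API above. It assumes equal-width bars with no spacing inside a label slot.

```python
import matplotlib.pyplot as plt
import numpy as np


def grouped_bar_sketch(ax, x, heights, *, group_labels=None, total_width=0.8):
    """Draw one set of side-by-side bars per group (illustrative only)."""
    n_groups = len(heights)
    if group_labels is None:
        group_labels = [None] * n_groups
    positions = np.arange(len(x))       # one slot per label
    bar_width = total_width / n_groups  # each group gets an equal share
    containers = []
    for i, (h, name) in enumerate(zip(heights, group_labels)):
        # Shift group i so that all groups together are centred on the tick.
        offset = (i - (n_groups - 1) / 2) * bar_width
        # Only pass a legend label if one was given for this group.
        kwargs = {} if name is None else {'label': name}
        containers.append(ax.bar(positions + offset, h, bar_width, **kwargs))
    ax.set_xticks(positions)
    ax.set_xticklabels(x)
    return containers


# Made-up data for the usage example.
labels = ['G1', 'G2', 'G3', 'G4', 'G5']
tea_means = [20, 34, 30, 35, 27]
coffee_means = [25, 32, 34, 20, 25]

fig, ax = plt.subplots()
bars = grouped_bar_sketch(ax, labels, [tea_means, coffee_means],
                          group_labels=['Tea', 'Coffee'])
ax.legend()
```

The sketch hard-codes several decisions (bar spacing, tick handling, return type) that a real API would still have to settle.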

I'll soon expand on the minimal API, answering a lot of questions from #11048 (comment). But before that, please speak up in case you have fundamental concerns with adding such functionality at all or with the bottom-up design approach. OTOH if you think this is worth pursuing, please give a 👍.
