Thursday, July 7, 2016

Python -- Pandas -- Add a new level to the column in DataFrame



Here is the situation: we have a data frame df

          X         Y         Z
0  0.738596  0.852906  0.333422
1  0.820456  0.014704  0.233935
2  0.118291  0.714536  0.176275
3  0.032417  0.819386  0.949590
4  0.739559  0.923865  0.791574

For some reason, we want to add a new level to the columns. Unlike adding a new level to the index, there does not seem to be a ready-made function in Pandas for this.

We can of course do it "manually", something like this:

df.columns = pd.MultiIndex.from_arrays(np.vstack((np.asarray(['test'] * df.shape[1]), df.columns.tolist())))
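
For a flat set of columns and a single outer label, the same result can probably be reached without NumPy. The following is only a sketch: it swaps in pd.MultiIndex.from_product in place of np.vstack, and the sample frame is made up to match the column labels above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with the same column labels as in the post.
df = pd.DataFrame(np.random.rand(5, 3), columns=["X", "Y", "Z"])

# Cartesian product of the single outer label with the existing labels.
df.columns = pd.MultiIndex.from_product([["test"], df.columns])

print(df.columns.tolist())
```

Because the outer list holds only one label, the product has exactly one entry per original column, in the original order.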

A more general function is

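def add_new_level(index, new_level):
    # new_level needs one label per entry of index, e.g. ['test'] * len(index)
    # index.get_values() was removed in pandas 1.0; tolist() works on old and new versions
    arrays = np.asarray(index.tolist()).T
    new_arrays = np.vstack((new_level, arrays))
    return pd.MultiIndex.from_arrays(new_arrays)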

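(I am not sure this is the simplest approach. Note also that np.asarray and np.vstack convert all labels to one common dtype, so integer labels mixed with strings end up as strings.)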

After assigning the result to df.columns, the data frame becomes:
       test                  
          X         Y         Z
0  0.738596  0.852906  0.333422
1  0.820456  0.014704  0.233935
2  0.118291  0.714536  0.176275
3  0.032417  0.819386  0.949590
4  0.739559  0.923865  0.791574
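
If the existing labels are not all strings, a variant that keeps each level's dtype may be preferable. Below is a minimal sketch under some assumptions: add_outer_level is a hypothetical helper name (not a Pandas function), and the new level is taken to be a single constant label.

```python
import pandas as pd

def add_outer_level(index, label, name=None):
    """Return a MultiIndex with `label` prepended as a constant outer level."""
    # Collect the existing levels one by one so each keeps its own dtype.
    inner = [index.get_level_values(i) for i in range(index.nlevels)]
    # One copy of the new label per entry of the original index.
    outer = pd.Index([label] * len(index), name=name)
    return pd.MultiIndex.from_arrays([outer] + inner)

cols = pd.Index(["X", "Y", "Z"])
new_cols = add_outer_level(cols, "test")
print(new_cols.tolist())
```

It would be used as df.columns = add_outer_level(df.columns, 'test'). Since it goes through get_level_values, it should accept a plain Index as well as an existing MultiIndex.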

Alternatively, we can use a small trick with pd.concat:

df = pd.concat([df], axis=1, keys=['test'])
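
To check that the pd.concat trick only wraps the columns and leaves the data alone, something like the sketch below can be used. The sample frame and the variable names wrapped and inner are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with the same column labels as in the post.
df = pd.DataFrame(np.random.rand(5, 3), columns=["X", "Y", "Z"])

# Concatenating a single frame along the columns with one key
# puts that key on top of the existing column labels.
wrapped = pd.concat([df], axis=1, keys=["test"])

# Selecting the outer label gives back the original columns.
inner = wrapped["test"]

# Dropping the outer level undoes the change on the column index.
flat = wrapped.columns.droplevel(0)

print(wrapped.columns.tolist())
print(flat.tolist())
```

The keys argument takes one entry per concatenated frame, so with a single frame every column gets the same outer label.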