Pandas support

It is convenient to use the Pandas package when dealing with numerical data, so Pint provides PintArray. A PintArray is a Pandas Extension Array, which allows Pandas to recognise the Quantity and store it in Pandas DataFrames and Series.

Basic example

This example shows the simplest way to use pandas with pint, along with the underlying objects. It’s slightly fiddly because the data is not read from a file. A more typical use case is given in Reading from csv.

First some imports

[1]:

import pandas as pd
import pint

Next, we create a DataFrame with PintArrays as columns.

[2]:

df = pd.DataFrame({
    "torque": pd.Series([1, 2, 2, 3], dtype="pint[lbf ft]"),
    "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
})
df

[2]:

	torque	angular_velocity
0	1	1
1	2	2
2	2	2
3	3	3

Operations on columns are unit-aware, so they behave as we would intuitively expect.

[3]:

df['power'] = df['torque'] * df['angular_velocity']
df

[3]:

	torque	angular_velocity	power
0	1	1	1
1	2	2	4
2	2	2	4
3	3	3	9

We can see the columns’ units in the dtypes attribute

[4]:

df.dtypes

[4]:

torque                                       pint[foot * force_pound]
angular_velocity                         pint[revolutions_per_minute]
power               pint[foot * force_pound * revolutions_per_minute]
dtype: object

Each column can be accessed as a Pandas Series

[5]:

df.power

[5]:

0    1
1    4
2    4
3    9
Name: power, dtype: pint[foot * force_pound * revolutions_per_minute]

The Series contains a PintArray

[6]:

df.power.values

[6]:

PintArray([1 foot * force_pound * revolutions_per_minute,
           4 foot * force_pound * revolutions_per_minute,
           4 foot * force_pound * revolutions_per_minute,
           9 foot * force_pound * revolutions_per_minute],
          dtype='pint[foot * force_pound * revolutions_per_minute]')

The PintArray contains a Quantity

[7]:

df.power.values.quantity

[7]:

\[\begin{pmatrix}1 & 4 & 4 & 9\end{pmatrix}\ \mathrm{foot} \cdot \mathrm{force\_pound} \cdot \mathrm{revolutions\_per\_minute}\]

Pandas Series accessors are provided for most Quantity properties and methods, which will convert the result to a Series where possible.

[8]:

df.power.pint.units

[8]:

foot force_pound revolutions_per_minute

[9]:

df.power.pint.to("kW").values

[9]:

PintArray([0.00014198092353610376 kilowatt, 0.000567923694144415 kilowatt,
           0.000567923694144415 kilowatt, 0.0012778283118249339 kilowatt],
          dtype='pint[kilowatt]')
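As a sanity check, the converted values can be reproduced by hand with the standard library alone. This is a minimal sketch, assuming the exact definitions 1 lbf = 4.4482216152605 N and 1 ft = 0.3048 m, and assuming that one revolution counts as 2π radians with the radian treated as dimensionless (which is what the kW figures above imply). The constant and function names are illustrative and not part of Pint.

```python
import math

# Assumed conversion factors (not taken from Pint's source):
LBF_IN_NEWTON = 4.4482216152605       # pound-force in newtons, exact by definition
FOOT_IN_METRE = 0.3048                # international foot in metres, exact by definition
RPM_IN_RAD_PER_S = 2 * math.pi / 60   # one revolution per minute in rad/s


def lbf_ft_rpm_to_kw(value):
    """Convert a value expressed in lbf*ft*rpm to kilowatts."""
    watts = value * LBF_IN_NEWTON * FOOT_IN_METRE * RPM_IN_RAD_PER_S
    return watts / 1000.0


# Same magnitudes as the "power" column above.
power_kw = [lbf_ft_rpm_to_kw(v) for v in (1, 4, 4, 9)]
print(power_kw)
```

The results should agree with the PintArray above up to floating-point rounding.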

Reading from csv

Reading from files is by far the more common way to use pandas. To facilitate this, DataFrame accessors are provided to make it easy to get to PintArrays.

[10]:

import pandas as pd
import pint
import io

Here are the contents of the csv file.

[11]:

test_data = '''speed,mech power,torque,rail pressure,fuel flow rate,fluid power
rpm,kW,N m,bar,l/min,kW
1000.0,,10.0,1000.0,10.0,
1100.0,,10.0,100000000.0,10.0,
1200.0,,10.0,1000.0,10.0,
1200.0,,10.0,1000.0,10.0,'''

Let’s read that into a DataFrame. Here io.StringIO stands in for a file on disk. Normally you would pass a csv file path instead, as shown in the commented-out line.

[12]:

df = pd.read_csv(io.StringIO(test_data), header=[0, 1])
# df = pd.read_csv("/path/to/test_data.csv", header=[0, 1])
df

[12]:

	speed	mech power	torque	rail pressure	fuel flow rate	fluid power
	rpm	kW	N m	bar	l/min	kW
0	1000.0	NaN	10.0	1000.0	10.0	NaN
1	1100.0	NaN	10.0	100000000.0	10.0	NaN
2	1200.0	NaN	10.0	1000.0	10.0	NaN
3	1200.0	NaN	10.0	1000.0	10.0	NaN
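Passing header=[0, 1] makes pandas treat the first two rows as a two-level column index, with the names on the top level and the unit strings on the bottom level. The sketch below uses only pandas and a shortened, hypothetical two-column version of the data to show how each level can be inspected.

```python
import io

import pandas as pd

# Shortened, hypothetical version of the csv content above.
csv_text = """speed,torque
rpm,N m
1000.0,10.0
1100.0,10.0"""

# The first two rows become the two levels of a column MultiIndex.
df_raw = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

names = list(df_raw.columns.get_level_values(0))   # top level: column names
units = list(df_raw.columns.get_level_values(-1))  # bottom level: unit strings
print(names)
print(units)
```

At this point the unit strings are plain labels; pandas attaches no meaning to them.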

The quantify method of the DataFrame’s pint accessor converts the columns from np.ndarrays to PintArrays, taking the units from the bottom column level. Before conversion, the columns are plain float64:

[13]:

df.dtypes

[13]:

speed           rpm      float64
mech power      kW       float64
torque          N m      float64
rail pressure   bar      float64
fuel flow rate  l/min    float64
fluid power     kW       float64
dtype: object

[14]:

df_ = df.pint.quantify(level=-1)
df_

[14]:

	speed	mech power	torque	rail pressure	fuel flow rate	fluid power
0	1000.0	nan	10.0	1000.0	10.0	nan
1	1100.0	nan	10.0	100000000.0	10.0	nan
2	1200.0	nan	10.0	1000.0	10.0	nan
3	1200.0	nan	10.0	1000.0	10.0	nan

As before, operations between DataFrame columns are unit-aware

[15]:

df_.speed*df_.torque

[15]:

0    10000.0
1    11000.0
2    12000.0
3    12000.0
dtype: pint[meter * newton * revolutions_per_minute]

[16]:

df_

[16]:

	speed	mech power	torque	rail pressure	fuel flow rate	fluid power
0	1000.0	nan	10.0	1000.0	10.0	nan
1	1100.0	nan	10.0	100000000.0	10.0	nan
2	1200.0	nan	10.0	1000.0	10.0	nan
3	1200.0	nan	10.0	1000.0	10.0	nan

[17]:

df_['mech power'] = df_.speed*df_.torque
df_['fluid power'] = df_['fuel flow rate'] * df_['rail pressure']
df_

[17]:

	speed	mech power	torque	rail pressure	fuel flow rate	fluid power
0	1000.0	10000.0	10.0	1000.0	10.0	10000.0
1	1100.0	11000.0	10.0	100000000.0	10.0	1000000000.0
2	1200.0	12000.0	10.0	1000.0	10.0	10000.0
3	1200.0	12000.0	10.0	1000.0	10.0	10000.0

The DataFrame’s pint.dequantify method then allows us to retrieve the units information as a header row once again.

[18]:

df_.pint.dequantify()

[18]:

	speed	mech power	torque	rail pressure	fuel flow rate	fluid power
unit	revolutions_per_minute	meter * newton * revolutions_per_minute	meter * newton	bar	liter / minute	bar * liter / minute
0	1000.0	10000.0	10.0	1000.0	10.0	1.000000e+04
1	1100.0	11000.0	10.0	100000000.0	10.0	1.000000e+09
2	1200.0	12000.0	10.0	1000.0	10.0	1.000000e+04
3	1200.0	12000.0	10.0	1000.0	10.0	1.000000e+04

This makes some powerful operations possible. For example, we can change the units of individual columns

[19]:

df_['fluid power'] = df_['fluid power'].pint.to("kW")
df_['mech power'] = df_['mech power'].pint.to("kW")
df_.pint.dequantify()

[19]:

	speed	mech power	torque	rail pressure	fuel flow rate	fluid power
unit	revolutions_per_minute	kilowatt	meter * newton	bar	liter / minute	kilowatt
0	1000.0	1.047198	10.0	1000.0	10.0	1.666667e+01
1	1100.0	1.151917	10.0	100000000.0	10.0	1.666667e+06
2	1200.0	1.256637	10.0	1000.0	10.0	1.666667e+01
3	1200.0	1.256637	10.0	1000.0	10.0	1.666667e+01
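The kilowatt figures in the first row can be checked by hand. This is a minimal sketch using only the standard library, assuming 1 bar = 10^5 Pa, 1 l = 10^-3 m³, and one revolution = 2π radians treated as dimensionless. The helper names are illustrative and not part of Pint.

```python
import math

# Assumed conversion factors to SI units:
RPM_IN_RAD_PER_S = 2 * math.pi / 60      # revolutions per minute -> rad/s
BAR_IN_PASCAL = 1e5                      # bar -> Pa
LITRE_PER_MIN_IN_M3_PER_S = 1e-3 / 60    # l/min -> m^3/s


def mech_power_kw(speed_rpm, torque_nm):
    """Mechanical power = angular speed * torque, returned in kW."""
    return speed_rpm * RPM_IN_RAD_PER_S * torque_nm / 1000.0


def fluid_power_kw(flow_l_per_min, pressure_bar):
    """Fluid power = volumetric flow * pressure, returned in kW."""
    watts = flow_l_per_min * LITRE_PER_MIN_IN_M3_PER_S * pressure_bar * BAR_IN_PASCAL
    return watts / 1000.0


# First row of the table: 1000 rpm, 10 N m, 10 l/min, 1000 bar.
mech = mech_power_kw(1000.0, 10.0)
fluid = fluid_power_kw(10.0, 1000.0)
print(mech, fluid)
```

Both values should match the first row of the mech power and fluid power columns above to the displayed precision.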

The units are harder to read than they need to be, so let’s change Pint’s default format for displaying units.

[20]:

pint.PintType.ureg.default_format = "~P"
df_.pint.dequantify()

[20]:

	speed	mech power	torque	rail pressure	fuel flow rate	fluid power
unit	rpm	kW	N·m	bar	l/min	kW
0	1000.0	1.047198	10.0	1000.0	10.0	1.666667e+01
1	1100.0	1.151917	10.0	100000000.0	10.0	1.666667e+06
2	1200.0	1.256637	10.0	1000.0	10.0	1.666667e+01
3	1200.0	1.256637	10.0	1000.0	10.0	1.666667e+01

We can also convert the units of the entire table

[21]:

df_.pint.to_base_units().pint.dequantify()

[21]:

	speed	mech power	torque	rail pressure	fuel flow rate	fluid power
unit	rad/s	kg·m²/s³	kg·m²/s²	kg/m/s²	m³/s	kg·m²/s³
0	104.719755	1047.197551	10.0	1.000000e+08	0.000167	1.666667e+04
1	115.191731	1151.917306	10.0	1.000000e+13	0.000167	1.666667e+09
2	125.663706	1256.637061	10.0	1.000000e+08	0.000167	1.666667e+04
3	125.663706	1256.637061	10.0	1.000000e+08	0.000167	1.666667e+04

Advanced example

This example shows alternative ways to use pint with pandas and other features.

Start with the same imports.

[22]:

import pandas as pd
import pint

We’ll use a shorthand for PintArray

[23]:

PA_ = pint.PintArray

And set up a unit registry and quantity shorthand.

[24]:

ureg = pint.UnitRegistry()
Q_ = ureg.Quantity

Operations between PintArrays that belong to different unit registries will not work. To prevent this issue, we can set the unit registry that is used when creating new PintArrays.

[25]:

pint.PintType.ureg = ureg

These are the possible ways to create a PintArray.

Note that the pint[unit] form must be used for the Series constructor, whereas the PintArray constructor also accepts a plain unit string or a unit object.

[26]:

df = pd.DataFrame({
        "length" : pd.Series([1,2], dtype="pint[m]"),
        "width" : PA_([2,3], dtype="pint[m]"),
        "distance" : PA_([2,3], dtype="m"),
        "height" : PA_([2,3], dtype=ureg.m),
        "depth" : PA_.from_1darray_quantity(Q_([2,3],ureg.m)),
    })
df

[26]:

	length	width	distance	height	depth
0	1	2	2	2	2
1	2	3	3	3	3

[27]:

df.length.values.units

[27]:

meter
