@@ -3231,7 +3231,7 @@ plt.show() # Displays the plot. Also plt.sav
 **Table with labeled rows and columns.**
 
 ```python
->>> l = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y']); l
+>>> df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y']); df
    x  y
 a  1  2
 b  3  4
@@ -3251,7 +3251,7 @@ b 3 4
 
 ```python
 <S/DF> = <DF>[col_key/s]                   # Or: <DF>.<col_key>
-<DF>   = <DF>[row_bools]                   # Keeps rows as specified by bools.
+<DF>   = <DF>[<S_of_bools>]                # Filters rows. For example `df[df.x > 1]`.
 <DF>   = <DF>[<DF_of_bools>]               # Assigns NaN to items that are False in bools.
 ```
 
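The two boolean forms in this hunk behave differently, which a short sketch can make concrete. It assumes the `df` defined in the first hunk; the names `row_mask`, `cell_mask`, `filtered` and `masked` are illustrative only and do not appear in the cheatsheet.

```python
import pandas as pd

# Same frame as in the first hunk of this diff.
df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])

row_mask = df.x > 1        # Series of bools, one per row key
filtered = df[row_mask]    # keeps only the rows where the mask is True

cell_mask = df > 1         # DataFrame of bools with the same shape as df
masked = df[cell_mask]     # keeps the shape; items where the mask is False become NaN

print(filtered)
print(masked)
```

A Series of bools drops rows, while a DataFrame of bools keeps every row and column and only blanks out individual items.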
@@ -3270,7 +3270,7 @@ b 3 4
 ```python
 <DF> = <DF>.head/tail/sample(<int>)        # Returns first, last, or random n rows.
 <DF> = <DF>.describe()                     # Describes columns. Also info(), corr(), shape.
-<DF> = <DF>.query('<query>')               # Filters rows with e.g. 'col_1 == val_1 and …'.
+<DF> = <DF>.query('<query>')               # Filters rows. For example `df.query('x > 1')`.
 ```
 
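As a rough illustration of the new `query()` comment, the sketch below checks that the query string selects the same rows as a boolean mask. It assumes the `df` from the first hunk; `threshold`, `by_query`, `by_mask` and `by_variable` are hypothetical names used only for this example.

```python
import pandas as pd

# Same frame as in the first hunk of this diff.
df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])

by_query = df.query('x > 1')              # the example from the new comment
by_mask = df[df.x > 1]                    # equivalent boolean-mask form

threshold = 1                             # local variable, referenced with '@' inside the query
by_variable = df.query('x > @threshold')

print(by_query)
```

All three expressions should return the single row `b`.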
 ```python
@@ -3280,37 +3280,37 @@ plt.show() # Displays the plot. Also plt.sav
 
 #### DataFrame — Merge, Join, Concat:
 ```python
->>> r = pd.DataFrame([[4, 5], [6, 7]], index=['b', 'c'], columns=['y', 'z']); r
+>>> df_2 = pd.DataFrame([[4, 5], [6, 7]], index=['b', 'c'], columns=['y', 'z']); df_2
    y  z
 b  4  5
 c  6  7
 ```
 
 ```text
-+------------------------+---------------+------------+------------+--------------------------+
-|                        |    'outer'    |   'inner'  |   'left'   |       Description        |
-+------------------------+---------------+------------+------------+--------------------------+
-| l.merge(r, on='y',     |    x   y   z  | x   y   z  | x   y   z  | Merges on column if 'on' |
-|            how=…)      | 0  1   2   .  | 3   4   5  | 1   2   .  | or 'left/right_on' are   |
-|                        | 1  3   4   5  |            | 3   4   5  | set, else on shared cols.|
-|                        | 2  .   6   7  |            |            | Uses 'inner' by default. |
-+------------------------+---------------+------------+------------+--------------------------+
-| l.join(r, lsuffix='l', |    x yl yr  z |            | x yl yr  z | Merges on row keys.      |
-|           rsuffix='r', | a  1  2  .  . | x yl yr  z | 1  2  .  . | Uses 'left' by default.  |
-|           how=…)       | b  3  4  4  5 | 3  4  4  5 | 3  4  4  5 | If r is a Series, it is  |
-|                        | c  .  .  6  7 |            |            | treated as a column.     |
-+------------------------+---------------+------------+------------+--------------------------+
-| pd.concat([l, r],      |    x   y   z  |     y      |            | Adds rows at the bottom. |
-|           axis=0,      | a  1   2   .  |     2      |            | Uses 'outer' by default. |
-|           join=…)      | b  3   4   .  |     4      |            | A Series is treated as a |
-|                        | b  .   4   5  |     4      |            | column. To add a row use |
-|                        | c  .   6   7  |     6      |            | pd.concat([l, DF([s])]). |
-+------------------------+---------------+------------+------------+--------------------------+
-| pd.concat([l, r],      |    x  y  y  z |            |            | Adds columns at the      |
-|           axis=1,      | a  1  2  .  . | x  y  y  z |            | right end. Uses 'outer'  |
-|           join=…)      | b  3  4  4  5 | 3  4  4  5 |            | by default. A Series is  |
-|                        | c  .  .  6  7 |            |            | treated as a column.     |
-+------------------------+---------------+------------+------------+--------------------------+
++-----------------------+---------------+------------+------------+---------------------------+
+|                       |    'outer'    |   'inner'  |   'left'   |       Description         |
++-----------------------+---------------+------------+------------+---------------------------+
+| df.merge(df_2,        |    x   y   z  | x   y   z  | x   y   z  | Merges on column if 'on'  |
+|          on='y',      | 0  1   2   .  | 3   4   5  | 1   2   .  | or 'left/right_on' are    |
+|          how=…)       | 1  3   4   5  |            | 3   4   5  | set, else on shared cols. |
+|                       | 2  .   6   7  |            |            | Uses 'inner' by default.  |
++-----------------------+---------------+------------+------------+---------------------------+
+| df.join(df_2,         |    x yl yr  z |            | x yl yr  z | Merges on row keys.       |
+|         lsuffix='l',  | a  1  2  .  . | x yl yr  z | 1  2  .  . | Uses 'left' by default.   |
+|         rsuffix='r',  | b  3  4  4  5 | 3  4  4  5 | 3  4  4  5 | If r is a Series, it is   |
+|         how=…)        | c  .  .  6  7 |            |            | treated as a column.      |
++-----------------------+---------------+------------+------------+---------------------------+
+| pd.concat([df, df_2], |    x   y   z  |     y      |            | Adds rows at the bottom.  |
+|           axis=0,     | a  1   2   .  |     2      |            | Uses 'outer' by default.  |
+|           join=…)     | b  3   4   .  |     4      |            | A Series is treated as a  |
+|                       | b  .   4   5  |     4      |            | column. To add a row use  |
+|                       | c  .   6   7  |     6      |            | pd.concat([df, DF([s])]). |
++-----------------------+---------------+------------+------------+---------------------------+
+| pd.concat([df, df_2], |    x  y  y  z |            |            | Adds columns at the       |
+|           axis=1,     | a  1  2  .  . | x  y  y  z |            | right end. Uses 'outer'   |
+|           join=…)     | b  3  4  4  5 | 3  4  4  5 |            | by default. A Series is   |
+|                       | c  .  .  6  7 |            |            | treated as a column.      |
++-----------------------+---------------+------------+------------+---------------------------+
 ```
 
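The table above can be spot-checked with a few calls. This is only a sketch that assumes the renamed frames `df` and `df_2` from this diff; the result names (`merged`, `joined`, `stacked`, `side_by_side`) are made up for illustration.

```python
import pandas as pd

# Frames as defined in this diff.
df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
df_2 = pd.DataFrame([[4, 5], [6, 7]], index=['b', 'c'], columns=['y', 'z'])

# Merge on the shared column 'y'; the original row keys are replaced by 0, 1, 2.
merged = df.merge(df_2, on='y', how='outer')

# Join on row keys ('left' by default); the overlapping column 'y' gets the suffixes.
joined = df.join(df_2, lsuffix='l', rsuffix='r')

# Stack rows ('outer' by default): row keys a, b, b, c and columns x, y, z.
stacked = pd.concat([df, df_2], axis=0)

# Place columns side by side and keep only the shared row key 'b'.
side_by_side = pd.concat([df, df_2], axis=1, join='inner')

for result in (merged, joined, stacked, side_by_side):
    print(result, end='\n\n')
```

Missing items print as `NaN` here, where the table shows a dot.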
 #### DataFrame — Aggregate, Transform, Map:
@@ -3321,23 +3321,23 @@ c 6 7
 ```
 
 ```text
-+----------------+---------------+---------------+---------------+
-|                |     'sum'     |    ['sum']    |  {'x': 'sum'} |
-+----------------+---------------+---------------+---------------+
-| l.apply(…)     |     x  4      |        x  y   |     x  4      |
-| l.agg(…)       |     y  6      |   sum  4  6   |               |
-+----------------+---------------+---------------+---------------+
++-----------------+---------------+---------------+---------------+
+|                 |     'sum'     |    ['sum']    |  {'x': 'sum'} |
++-----------------+---------------+---------------+---------------+
+| df.apply(…)     |     x  4      |        x  y   |     x  4      |
+| df.agg(…)       |     y  6      |   sum  4  6   |               |
++-----------------+---------------+---------------+---------------+
 ```
 
 ```text
-+----------------+---------------+---------------+---------------+
-|                |     'rank'    |    ['rank']   | {'x': 'rank'} |
-+----------------+---------------+---------------+---------------+
-| l.apply(…)     |               |      x    y   |               |
-| l.agg(…)       |      x    y   |   rank rank   |        x      |
-| l.transform(…) |  a  1.0  1.0  | a  1.0  1.0   |    a  1.0     |
-|                |  b  2.0  2.0  | b  2.0  2.0   |    b  2.0     |
-+----------------+---------------+---------------+---------------+
++-----------------+---------------+---------------+---------------+
+|                 |     'rank'    |    ['rank']   | {'x': 'rank'} |
++-----------------+---------------+---------------+---------------+
+| df.apply(…)     |               |      x    y   |               |
+| df.agg(…)       |      x    y   |   rank rank   |        x      |
+| df.transform(…) |  a  1.0  1.0  | a  1.0  1.0   |    a  1.0     |
+|                 |  b  2.0  2.0  | b  2.0  2.0   |    b  2.0     |
++-----------------+---------------+---------------+---------------+
 ```
 * **All methods operate on columns by default. Pass `'axis=1'` to process the rows instead.**
 * **Fifth result's columns are indexed with a multi-index. This means we need a tuple of column keys to specify a column: `'<DF>.loc[row_key, (col_key_1, col_key_2)]'`.**
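
The two notes above can be tried out with a small sketch. It assumes the `df` from the first hunk and reads "fifth result" as the `['rank']` column of the second table; `totals`, `row_totals`, `ranks` and `cell` are illustrative names only.

```python
import pandas as pd

# Same frame as in the first hunk of this diff.
df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])

totals = df.agg('sum')               # per column: x -> 4, y -> 6
row_totals = df.agg('sum', axis=1)   # per row:    a -> 3, b -> 7

# Passing a list of function names produces multi-indexed columns:
# ('x', 'rank') and ('y', 'rank').
ranks = df.transform(['rank'])

# A single column is addressed with a tuple of column keys.
cell = ranks.loc['a', ('x', 'rank')]

print(totals, row_totals, ranks, cell, sep='\n\n')
```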