Left Join

Author

Matt Pickard

What is a left join?

A left join keeps all the rows from the left table and inserts NULLs on the right when there is not a matching row found in the right table.

This Venn diagram illustrates which rows are included in a left join.

Note

A left join is the mirror of a right join.

How does a left join work?

A left join visits every row in the left table and then checks if there is a matching row in the right table. If there is, it combines the matching left and right row. If there is not, it places nulls on the right side.

Left Table

Right Table

Left Joined Result

Notice how the last row (id=-5, name="Rowlf the Dog") in the left table does not have a match in the right table. So, nulls are inserted into the right columns of the result.

The Data School provides a nice animation of the row by row operations for a left join.

How many records end up in the result of an inner join?

\(L =\) # of records in the left table

\(R =\) # of records in the right table

The number of records in the final left join is \(L\).

Implementing a left join in different tools

Here is how you perform a left join in R, Python, Power BI, and Excel.

R

library(dplyr)

employees %>% 
  left_join(positions, by = "muppet_id")
# A tibble: 5 × 4
  muppet_id names           role             wage_per_hour
      <dbl> <chr>           <chr>                    <dbl>
1         1 Kermit the Frog pragmatic leader          75  
2         2 Miss Piggy      diva pig                  85  
3         3 Fozzie the Bear stand-up comic            25.2
4         4 Gonzo           stunt performer           54.5
5         5 Rowlf the Dog   <NA>                      NA  

Python

import pandas as pd

pd.merge(employees, positions, how='left')
   muppet_id            names              role  wage_per_hour
0        1.0  Kermit the Frog  pragmatic leader          75.00
1        2.0       Miss Piggy          diva pig          85.00
2        3.0  Fozzie the Bear    stand-up comic          25.25
3        4.0            Gonzo   stunt performer          54.50
4        5.0    Rowlf the Dog               NaN            NaN

Power BI

In Power BI, we invoke Power Query through the Home >> Transform data menu item.

Excel

In native Excel, there is not a true left join. We can use VLOOKUP() repeatedly to “join” multiples columns in a “left join” manner.

Note

Like with Power BI, Power Query is also embedded in Excel. We could also use it in Excel (through the Data >> Get Data menu item).