Hands-on examples using pd.concat, pd.merge, and the .join() method are provided. We'll work with small, clear datasets to illustrate how each technique works.First, make sure you have Pandas imported. We'll use the conventional alias pd.import pandas as pd import numpy as np # Often used alongside PandasPreparing Sample DataFramesTo demonstrate concatenation and merging, let's create a few simple DataFrames.# DataFrames for concatenation examples df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2'], 'B': ['B0', 'B1', 'B2'], 'C': ['C0', 'C1', 'C2']}, index=[0, 1, 2]) df2 = pd.DataFrame({'A': ['A3', 'A4', 'A5'], 'B': ['B3', 'B4', 'B5'], 'C': ['C3', 'C4', 'C5']}, index=[3, 4, 5]) df3 = pd.DataFrame({'D': ['D0', 'D1', 'D2'], 'E': ['E0', 'E1', 'E2'], 'F': ['F0', 'F1', 'F2']}, index=[0, 1, 2]) # DataFrames for merging/joining examples left_df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'], 'L_val': ['L0', 'L1', 'L2', 'L3']}) right_df = pd.DataFrame({'key': ['K0', 'K1', 'K4', 'K5'], 'R_val': ['R0', 'R1', 'R4', 'R5']}) # DataFrames for index-based joining left_join_df = pd.DataFrame({'L_val': ['L0', 'L1', 'L2']}, index=pd.Index(['K0', 'K1', 'K2'], name='key')) right_join_df = pd.DataFrame({'R_val': ['R0', 'R1', 'R4']}, index=pd.Index(['K0', 'K1', 'K4'], name='key')) Let's look at our initial DataFrames:print("--- df1 ---") print(df1) print("\n--- df2 ---") print(df2) print("\n--- df3 ---") print(df3) print("\n--- left_df ---") print(left_df) print("\n--- right_df ---") print(right_df) print("\n--- left_join_df ---") print(left_join_df) print("\n--- right_join_df ---") print(right_join_df)Practice: Concatenation with pd.concatConcatenation is like stacking or gluing DataFrames together.Vertical Concatenation (Stacking Rows)This is the default behavior of pd.concat. It stacks DataFrames vertically, aligning columns by name.# Concatenate df1 and df2 vertically (axis=0 is default) vertical_concat = pd.concat([df1, df2]) print("--- Vertical Concatenation (Default) ---") print(vertical_concat)Notice how df2 is appended below df1. The indices from the original DataFrames are preserved. If index uniqueness is important or you prefer a clean 0-based index, use ignore_index=True.# Concatenate df1 and df2, ignoring original index vertical_concat_new_index = pd.concat([df1, df2], ignore_index=True) print("\n--- Vertical Concatenation (ignore_index=True) ---") print(vertical_concat_new_index)Horizontal Concatenation (Adding Columns)To combine DataFrames side-by-side based on their index, use axis=1.# Concatenate df1 and df3 horizontally (axis=1) horizontal_concat = pd.concat([df1, df3], axis=1) print("--- Horizontal Concatenation (axis=1) ---") print(horizontal_concat)Here, df1 and df3 are combined based on matching index labels (0, 1, 2). If indices didn't align perfectly, Pandas would introduce NaN values for missing matches (an outer join behavior along the index).Practice: Merging with pd.mergeMerging combines DataFrames based on the values in common columns, similar to SQL joins.Inner Merge (Default)An inner merge keeps only the rows where the key exists in both DataFrames.# Perform an inner merge on the column inner_merge = pd.merge(left_df, right_df, on='key', how='inner') # 'how=inner' is default print("--- Inner Merge ---") print(inner_merge)Only keys 'K0' and 'K1', which are present in both left_df and right_df, appear in the result.Outer MergeAn outer merge keeps all rows from both DataFrames. Where a key doesn't exist in one of the DataFrames, NaN values are introduced for the columns originating from that DataFrame.# Perform an outer merge on the column outer_merge = pd.merge(left_df, right_df, on='key', how='outer') print("\n--- Outer Merge ---") print(outer_merge)Keys 'K0', 'K1', 'K2', 'K3', 'K4', and 'K5' are all present. Notice the NaN values where keys are missing in one of the original DataFrames (e.g., 'K2' and 'K3' have NaN for R_val, 'K4' and 'K5' have NaN for L_val).Left MergeA left merge keeps all rows from the left DataFrame and includes matching rows from the right DataFrame. If a key from the left DataFrame doesn't exist in the right, NaN is used for the right DataFrame's columns.# Perform a left merge on the column left_merge = pd.merge(left_df, right_df, on='key', how='left') print("\n--- Left Merge ---") print(left_merge)All keys from left_df ('K0', 'K1', 'K2', 'K3') are present. 'K2' and 'K3' have NaN for R_val because they are not in right_df. Keys only present in right_df ('K4', 'K5') are excluded.Right MergeA right merge is the opposite of a left merge. It keeps all rows from the right DataFrame and includes matching rows from the left.# Perform a right merge on the column right_merge = pd.merge(left_df, right_df, on='key', how='right') print("\n--- Right Merge ---") print(right_merge)All keys from right_df ('K0', 'K1', 'K4', 'K5') are present. 'K4' and 'K5' have NaN for L_val. Keys only present in left_df ('K2', 'K3') are excluded.To help visualize these joins, consider the keys as sets:digraph G { rankdir=LR; node [shape=plaintext, fontsize=12]; subgraph cluster_0 { label = "Inner Join\n(Keys in Both)"; bgcolor="#e9ecef"; node [shape=circle, style=filled, fillcolor="#74c0fc", fontcolor=white]; a; b; node [shape=circle, style=solid, color="#adb5bd", fontcolor="#495057"]; c; d; e; f; a [label="K0"]; b [label="K1"]; c [label="K2"]; d [label="K3"]; e [label="K4"]; f [label="K5"]; edge [style=invis]; a -> b -> c -> d -> e -> f; } subgraph cluster_1 { label = "Left Join\n(All Keys from Left)"; bgcolor="#e9ecef"; node [shape=circle, style=filled, fillcolor="#74c0fc", fontcolor=white]; g; h; i; j; node [shape=circle, style=solid, color="#adb5bd", fontcolor="#495057"]; k; l; g [label="K0"]; h [label="K1"]; i [label="K2"]; j [label="K3"]; k [label="K4"]; l [label="K5"]; edge [style=invis]; g -> h -> i -> j -> k -> l; } subgraph cluster_2 { label = "Right Join\n(All Keys from Right)"; bgcolor="#e9ecef"; node [shape=circle, style=filled, fillcolor="#74c0fc", fontcolor=white]; m; n; o; p; node [shape=circle, style=solid, color="#adb5bd", fontcolor="#495057"]; q; r; m [label="K0"]; n [label="K1"]; o [label="K4"]; p [label="K5"]; q [label="K2"]; r [label="K3"]; edge [style=invis]; m -> n -> o -> p -> q -> r; } subgraph cluster_3 { label = "Outer Join\n(All Keys from Both)"; bgcolor="#e9ecef"; node [shape=circle, style=filled, fillcolor="#74c0fc", fontcolor=white]; s; t; u; v; w; x; s [label="K0"]; t [label="K1"]; u [label="K2"]; v [label="K3"]; w [label="K4"]; x [label="K5"]; edge [style=invis]; s -> t -> u -> v -> w -> x; } }Visual representation of different join types based on keys 'K0'-'K5'. Filled blue circles indicate keys included in the result for each join type, assuming left_df keys are K0-K3 and right_df keys are K0, K1, K4, K5.Merging on Different Column NamesIf the columns have different names in the DataFrames, use left_on and right_on.# Temporarily rename 'key' in right_df for demonstration right_df_renamed = right_df.rename(columns={'key': 'common_key'}) print("\n--- right_df_renamed ---") print(right_df_renamed) # Merge using left_on and right_on merge_diff_names = pd.merge(left_df, right_df_renamed, left_on='key', right_on='common_key', how='inner') print("\n--- Merge with Different Names (left_on, right_on) ---") print(merge_diff_names)The result includes both original columns (key from left_df and common_key from right_df_renamed). You might want to drop one afterwards.Practice: Index-Based Joining with .joinThe .join() method is a convenient way to perform merges based on the DataFrame indices. It defaults to a left join.# Perform a default (left) join on the index left_index_join = left_join_df.join(right_join_df, how='left') # 'how=left' is default print("--- Left Index Join (Default .join()) ---") print(left_index_join)This keeps all index values from left_join_df ('K0', 'K1', 'K2') and adds the corresponding R_val from right_join_df. 'K2' gets NaN for R_val as it's not in right_join_df's index.You can specify other join types using the how parameter, just like with pd.merge.# Perform an outer join on the index using .join() outer_index_join = left_join_df.join(right_join_df, how='outer') print("\n--- Outer Index Join (.join(how='outer')) ---") print(outer_index_join)This includes all index values from both DataFrames ('K0', 'K1', 'K2', 'K4'), filling missing values with NaN.You can also use .join() to join a DataFrame's index with a column in another DataFrame by setting the joining column as the index temporarily in the other DataFrame, or by using the on parameter within .join().# Join left_df (using column) with right_join_df (using its index) # We can use the 'on' parameter in join to specify the column in left_df column_index_join = left_df.join(right_join_df, on='key', how='inner') print("\n--- Joining Column to Index (right_join_df) ---") print(column_index_join)This practical session covered the primary ways to combine DataFrames in Pandas. Concatenation stacks data, while merging and joining combine data based on shared keys (in columns or indices) using different logical rules (inner, outer, left, right). Experiment with these techniques on your own datasets to solidify your understanding.