We wish to remove leading and trailing white space characters. We will also cover how to eliminate duplicate spaces within the string.
We wish to remove leading and trailing white space characters.
df_2 = df.assign(
    col_2 = df['col_1'].str.strip()
)
Here is how this works:
We use str.strip() to eliminate leading and trailing white spaces.
df_2 will be a copy of the input data frame df with an added column col_2 where each value is the corresponding value of col_1 with any leading or trailing white space characters removed.
Extension: Strip One Side Only
df_2 = df.assign(
    col_2 = df['col_1'].str.lstrip(),
    col_3 = df['col_1'].str.rstrip()
)
Here is how this works:
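By default, str.strip() will remove white space characters on both sides of the string.
To strip one side only, we can use the str.lstrip() or str.rstrip() methods instead of str.strip(): str.lstrip() removes leading white space and str.rstrip() removes trailing white space.
Extension: Specify Characters to Strip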
df_2 = df.assign(
    col_2 = df['col_1'].str.strip('_')
)
Here is how this works:
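An additional feature of the str.strip() method is the ability to define other characters to remove should they be leading or trailing.
Here we pass '_' so that leading and trailing '_' characters are removed instead of white space.
We wish to replace duplicate white space characters within the string with a single white space.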
df_2 = df.assign(
    col_2 = df['col_1'].str.replace(r'\s{2,}', ' ', regex=True)
)
Here is how this works:
We use str.replace() with regex=True to capture all occurrences of intermittent duplicate white spaces and replace them with a single white space.
The regular expression we use is "\s{2,}" where:
\s denotes a white space character and
{2,} specifies that we wish to capture a pattern of two or more occurrences.
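To see how the operations above behave side by side, we can run them on a small data frame. The following is a minimal sketch: the sample values in col_1 and the output column names (both, left, right, underscores, collapsed, cleaned) are assumptions made for illustration only. The last column chains the two ideas, first collapsing duplicate white space and then stripping both ends.

```python
import pandas as pd

# Hypothetical sample data chosen to exercise each case covered above.
df = pd.DataFrame({'col_1': ['  a  b ', '__c   d__', 'e f']})

# Collapse runs of two or more white space characters into a single space.
collapsed = df['col_1'].str.replace(r'\s{2,}', ' ', regex=True)

df_2 = df.assign(
    both=df['col_1'].str.strip(),            # strip white space on both sides
    left=df['col_1'].str.lstrip(),           # strip leading white space only
    right=df['col_1'].str.rstrip(),          # strip trailing white space only
    underscores=df['col_1'].str.strip('_'),  # strip '_' instead of white space
    collapsed=collapsed,
    cleaned=collapsed.str.strip(),           # collapse first, then strip ends
)

# Print each column as a list so that the remaining spaces are visible.
for name in df_2.columns:
    print(name, df_2[name].tolist())
```

Note that collapsing alone can still leave one leading or trailing space, for example '  a  b ' becomes ' a b ', which is why the cleaned column applies str.strip() afterwards.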