Removing substrings in pandas, Python
Question:
I have a list of high schools. I am trying to remove the generic endings from the school names.
in[1]:df
out[2]:
time school
1 09:00 Brown Academy
2 10:00 Covfefe High School
3 11:00 Bradley High
4 12:00 Johnson Prep
school_endings = ['Academy', 'Prep', 'High', 'High School']
Desired output:
out[3]:
time school
1 09:00 Brown
2 10:00 Covfefe
3 11:00 Bradley
4 12:00 Johnson
Answer
endings = ['Academy', 'Prep', 'High', 'High School']
endings = sorted(endings, key=len, reverse=True)
df.assign(school=df.school.replace(endings, '', regex=True).str.strip())
time school
1 09:00 Brown
2 10:00 Covfefe
3 11:00 Bradley
4 12:00 Johnson
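For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. The DataFrame is reconstructed from the question, and the helper name strip_endings is only illustrative (it is not part of pandas). The endings are sorted longest first so that 'High School' is tried before 'High'.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame(
    {
        "time": ["09:00", "10:00", "11:00", "12:00"],
        "school": [
            "Brown Academy",
            "Covfefe High School",
            "Bradley High",
            "Johnson Prep",
        ],
    },
    index=[1, 2, 3, 4],
)

endings = ["Academy", "Prep", "High", "High School"]


def strip_endings(names, endings):
    """Hypothetical helper: remove each ending from a Series of names."""
    # Longest first, so "High School" is handled before the shorter "High".
    ordered = sorted(endings, key=len, reverse=True)
    # Each ending is treated as a regex and replaced with an empty string;
    # str.strip() then removes the leftover whitespace.
    return names.replace(ordered, "", regex=True).str.strip()


result = df.assign(school=strip_endings(df["school"], endings))
print(result)
```

Because the patterns are not anchored, this removes the endings wherever they occur in the name, not only at the end.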
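Answer
Use the rstrip() method to strip the unwanted text from the end of the original string. Note that rstrip() treats its argument as a set of characters rather than an exact suffix, so it leaves the space before the ending in place and can remove too much if the name itself ends in those characters. For example:
mystring = "Brown Academy"
mystring.rstrip("Academy")
-> this gives the output: 'Brown ' (with a trailing space, so follow it with strip())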
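Since rstrip() works on a set of characters, it may be worth comparing it with an exact suffix check. The sketch below is only illustrative: strip_suffix is a hypothetical helper, and "Carroll High School" is a made-up name chosen to show the over-stripping.

```python
def strip_suffix(name, endings):
    """Hypothetical helper: remove one whole-word ending from the end of name."""
    # Try the longest endings first so "High School" wins over "High".
    for ending in sorted(endings, key=len, reverse=True):
        if name.endswith(" " + ending):
            # Drop the ending together with the space in front of it.
            return name[: -len(ending) - 1]
    return name


endings = ["Academy", "Prep", "High", "High School"]

# rstrip removes any trailing characters that appear in its argument,
# so it also eats the "oll" at the end of "Carroll".
over_stripped = "Carroll High School".rstrip("High School")

# The suffix check removes only the exact ending.
cleaned = strip_suffix("Carroll High School", endings)

print(repr(over_stripped))  # 'Carr'
print(repr(cleaned))        # 'Carroll'
```

On Python 3.9 and later, str.removesuffix() offers the same exact-suffix behavior for a single ending.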
Answer
I would probably use a regular expression substitution:
import re
df['school']=df['school'].apply(lambda x: re.sub(r'\s+((Academy)|(Prep)|(High)|(High School))$','',x))
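If the list of endings changes often, the pattern could be built from the list instead of being written by hand. This is a sketch under that assumption; build_ending_pattern is a hypothetical name, and "High Point Academy" is a made-up example showing that only the final ending is removed.

```python
import re


def build_ending_pattern(endings):
    """Hypothetical helper: compile a regex matching any ending at end of string."""
    # Longest alternatives first; re.escape guards against regex metacharacters.
    ordered = sorted(endings, key=len, reverse=True)
    alternation = "|".join(re.escape(e) for e in ordered)
    # \s+ consumes the whitespace before the ending, $ anchors at the end.
    return re.compile(r"\s+(?:" + alternation + r")$")


pattern = build_ending_pattern(["Academy", "Prep", "High", "High School"])

names = ["Brown Academy", "Covfefe High School", "Bradley High", "Johnson Prep"]
cleaned = [pattern.sub("", n) for n in names]
print(cleaned)  # ['Brown', 'Covfefe', 'Bradley', 'Johnson']

# The anchor keeps an ending word that appears earlier in the name.
print(pattern.sub("", "High Point Academy"))  # High Point
```

With pandas, the compiled pattern can be passed to the vectorized string method instead of apply, for example df['school'].str.replace(pattern, '', regex=True).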
Answer
Use split (this keeps only the first word, so it assumes every school name is a single word):
df.school = df.school.str.split(' ').str[0]
school time
0 Brown 09:00
1 Covfefe 10:00
2 Bradley 11:00
3 Johnson 12:00