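Removing unwanted characters from a string in Python
Question:
I have some strings from which I want to remove some unwanted characters. For example: Adam'sApple ----> AdamsApple (case-insensitive).
Can someone help me? I need the fastest way to do this, because I have 20 million records that need to be cleaned up. Thanks.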
Answer
A simple way:
>>> s = "Adam'sApple"
>>> x = s.replace("'", "")
>>> print x
AdamsApple
...or look at regex substitutions.
Answer
Try:
"Adam'sApple".replace("'", '')
To replace several different characters with nothing in one step:
import re
print re.sub(r'''['"x]''', '', '''a'"xb''')
Output:
ab
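When the set of unwanted characters is longer, the character class can be built from a plain string instead of being typed by hand. The following is a minimal sketch in Python 3 syntax (an assumption, since the answers in this thread use Python 2); `UNWANTED`, `PATTERN`, and `strip_unwanted` are illustrative names, not part of the answer above.

```python
import re

# Characters to delete; chosen here to match the example above.
UNWANTED = "'\"x"

# re.escape guards characters that are special inside a character class
# (such as ']' or '^'). Compiling once avoids re-parsing the pattern
# for every record.
PATTERN = re.compile("[" + re.escape(UNWANTED) + "]")


def strip_unwanted(text):
    """Return text with every character in UNWANTED removed."""
    return PATTERN.sub("", text)


print(strip_unwanted("a'\"xb"))  # ab
```

Compiling the pattern once and reusing it is likely to matter when the same substitution runs over millions of records, though it is worth timing against `replace` and `translate` on real data.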
Answer
str.replace("'","");
Answer
Any character in the second argument of the translate method is deleted:
>>> "Adam's Apple!".translate(None,"'!")
'Adams Apple'
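Note: translate requires Python 2.6 or later to accept None as the first argument; otherwise the first argument must be a translation string of length 256. On versions before 2.6, string.maketrans("", "") can be used in place of None.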
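The call above is Python 2 specific. In Python 3, `str.translate` takes a single table argument, and the characters to delete are passed as the third argument of `str.maketrans`. A minimal sketch, assuming Python 3; `DELETE_TABLE` and `remove_chars` are illustrative names:

```python
# The third argument of str.maketrans lists the characters to delete.
# Building the table once lets it be reused for every record.
DELETE_TABLE = str.maketrans("", "", "'!")


def remove_chars(text):
    """Return text with apostrophes and exclamation marks removed."""
    return text.translate(DELETE_TABLE)


print(remove_chars("Adam's Apple!"))  # Adams Apple
```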
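Answer
As has already been pointed out several times, you have to use replace
or a regular expression (though you most likely don't need regexes), but if you also have to make sure that the resulting string is plain ASCII (with no funky characters like é, ò, µ, æ or φ), you can finally do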
>>> u'(like é, ò, µ, æ or φ)'.encode('ascii', 'ignore')
'(like , , ,  or )'
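The example above relies on Python 2, where `encode` returns a `str`. In Python 3 the same call returns `bytes`, so a decode step is needed to get text back. A minimal sketch, assuming Python 3; `to_ascii` is an illustrative name:

```python
def to_ascii(text):
    """Drop every character that cannot be encoded as ASCII."""
    # errors="ignore" silently skips unencodable characters;
    # decoding turns the resulting bytes back into a str.
    return text.encode("ascii", "ignore").decode("ascii")


print(to_ascii("(like é, ò, µ, æ or φ)"))
```

Note that this discards accented letters entirely rather than replacing them with an unaccented equivalent.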
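Answer
Here is a function that removes all the irritating ASCII characters; the only exception is "&", which is replaced with "and". I use it to police a filesystem and make sure that all files adhere to the file naming scheme I insist everyone use.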
def cleanString(incomingString):
    newstring = incomingString
    newstring = newstring.replace("!","")
    newstring = newstring.replace("@","")
    newstring = newstring.replace("#","")
    newstring = newstring.replace("$","")
    newstring = newstring.replace("%","")
    newstring = newstring.replace("^","")
    newstring = newstring.replace("&","and")
    newstring = newstring.replace("*","")
    newstring = newstring.replace("(","")
    newstring = newstring.replace(")","")
    newstring = newstring.replace("+","")
    newstring = newstring.replace("=","")
    newstring = newstring.replace("?","")
    newstring = newstring.replace("\'","")
    newstring = newstring.replace("\"","")
    newstring = newstring.replace("{","")
    newstring = newstring.replace("}","")
    newstring = newstring.replace("[","")
    newstring = newstring.replace("]","")
    newstring = newstring.replace("<","")
    newstring = newstring.replace(">","")
    newstring = newstring.replace("~","")
    newstring = newstring.replace("`","")
    newstring = newstring.replace(":","")
    newstring = newstring.replace(";","")
    newstring = newstring.replace("|","")
    newstring = newstring.replace("\\","")
    newstring = newstring.replace("/","")
    return newstring
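The same cleanup can be expressed as one translation table, so each string is scanned once instead of once per character. This is a sketch of an alternative, not the author's code, and it assumes Python 3, where `str.maketrans` accepts a dict whose values may be `None` (delete) or a replacement string. `DELETED`, `TABLE`, and `clean_string` are illustrative names.

```python
# The characters that cleanString above deletes outright.
DELETED = "!@#$%^*()+=?'\"{}[]<>~`:;|\\/"

# Map every deleted character to None, and '&' to the word "and".
mapping = dict.fromkeys(DELETED)
mapping["&"] = "and"
TABLE = str.maketrans(mapping)


def clean_string(text):
    """Remove the characters in DELETED and spell out '&' as 'and'."""
    return text.translate(TABLE)


print(clean_string("R&D: report (final)?.txt"))  # RandD report final.txt
```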
Answer
An alternative that takes a string and an array of unwanted characters:
# Function that removes unwanted signs from str.
# Pass the string to the function along with an array of unwanted chars.
def removeSigns(str,arrayOfChars):
    charFound = False
    newstr = ""
    for letter in str:
        for char in arrayOfChars:
            if letter == char:
                charFound = True
                break
        if charFound == False:
            newstr += letter
        charFound = False
    return newstr
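The nested loop above compares every letter against every unwanted character. A shorter variant with the same result uses a set for the membership test and joins the kept characters in one pass. This is a sketch of an alternative rather than the author's code; `remove_signs` and its parameter names are illustrative.

```python
def remove_signs(text, unwanted_chars):
    """Return text without any character listed in unwanted_chars."""
    # A set gives constant-time membership checks on average.
    unwanted = set(unwanted_chars)
    # str.join avoids repeatedly concatenating onto a growing string.
    return "".join(ch for ch in text if ch not in unwanted)


print(remove_signs("Adam's Apple!", ["'", "!"]))  # Adams Apple
```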
Can you be more specific? Which exact characters do you want to remove? – Syntactic 2010-05-06 12:06:26