How to remove text inside nested parentheses in Python

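Question:

I have a Python string from which I need to remove parenthesized text. The standard approach is text = re.sub(r'\([^)]*\)', '', text), which deletes each pair of parentheses along with its contents.

However, I just ran into a string that looks like (Data with in (Boo) And good luck). With the regex I'm using, the And good luck) part is still left over. I know I could scan the whole string, keep a counter of ( and ), record the positions when the count balances, and delete everything in between, but is there a better or cleaner way to do this? It doesn't have to be a regex, although it would be great if one worked. Thanks.

Someone asked what the expected result is. Given: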

Hi this is a test (a b (c d) e) sentence

after the replacement I want Hi this is a test sentence, not Hi this is a test e) sentence.

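The re module can't do this, but the regex module, which supports recursion, can: https://pypi.python.org/pypi/regex

In the worst case, you can still do it with the re module: build a pattern that matches only the innermost parentheses, '\([^()]*\)', and repeat the substitution until there is nothing left to replace. It isn't very elegant, though, because the string has to be parsed several times.

Are you open to a non-regex solution? – Dan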

Answer

With the re module (replace the innermost parentheses until there is nothing left to replace):

import re 

s = r'Sainte Anne -(Data with in (Boo) And good luck) Charenton' 

nb_rep = 1 

while (nb_rep): 
    (s, nb_rep) = re.subn(r'\([^()]*\)', '', s) 

print(s)
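
The same loop can be wrapped in a reusable function. This is only a sketch: the name strip_nested_parens is made up for illustration, and it assumes the parentheses are balanced. Each pass peels off one nesting level, so the string is rescanned once per level plus a final pass that finds nothing.

```python
import re

# Matches one innermost group: "(", then no parentheses at all, then ")".
INNERMOST = re.compile(r'\([^()]*\)')

def strip_nested_parens(text):  # hypothetical helper name
    """Remove balanced parenthesized groups, innermost first."""
    count = 1
    while count:
        # subn returns (new_string, number_of_replacements_made)
        text, count = INNERMOST.subn('', text)
    return text

print(strip_nested_parens('Sainte Anne -(Data with in (Boo) And good luck) Charenton'))
# prints: Sainte Anne - Charenton
```

Whitespace around a removed group is left alone, so a group with a space on each side leaves two spaces behind.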

With the regex module, which supports recursion:

import regex 

s = r'Sainte Anne -(Data with in (Boo) And good luck) Charenton' 

print(regex.sub(r'\([^()]*+(?:(?R)[^()]*)*+\)', '', s))

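Here (?R) refers to the whole pattern itself, so each nested group is matched recursively.

The first answer is beautiful and works great. Thanks. – JLTChiu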

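Answer

First, I split the line into tokens that contain no parentheses, then join them back into a new line. Note that this strips only the parenthesis characters and keeps the text between them:

import re

line = "(Data with in (Boo) And good luck)"
new_line = "".join(re.split(r'(?:[()])', line))
print(new_line)
# 'Data with in Boo And good luck'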

Answer

Without regex...

>>> a = 'Hi this is a test (a b (c d) e) sentence'
>>> o = ['(' == t or t == ')' for t in a]
>>> o
[False, False, False, False, False, False, False, False, False, False,
False, False, False, False, False, False, False, False, True, False, False,
False, False, True, False, False, False, True, False, False, True,
False, False, False, False, False, False, False, False, False]
>>> start, end = 0, 0
>>> for n, i in enumerate(o):
...     if i and not start:
...         start = n
...     if i and start:
...         end = n
...
>>>
>>> start
18
>>> end
30
>>> a1 = ' '.join(''.join(i for n, i in enumerate(a) if (n < start or n > end)).split())
>>> a1
'Hi this is a test sentence'
>>>
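
The session above effectively locates the first and the last parenthesis and drops everything between them. A more compact sketch of that same idea, using str.find and str.rfind, might look like this. The function name remove_outer_span is made up for illustration. Unlike the truthiness check on start above, it also copes with a parenthesis at index 0.

```python
def remove_outer_span(text):  # hypothetical helper name
    """Drop everything from the first '(' to the last ')' and tidy spaces."""
    start = text.find('(')   # -1 if there is no opening parenthesis
    end = text.rfind(')')    # -1 if there is no closing parenthesis
    if start == -1 or end < start:
        kept = text          # nothing sensible to remove
    else:
        kept = text[:start] + text[end + 1:]
    # split()/join collapses runs of whitespace into single spaces
    return ' '.join(kept.split())

print(remove_outer_span('Hi this is a test (a b (c d) e) sentence'))
# prints: Hi this is a test sentence
```

Like the original session, this treats the string as containing a single outer group: with two separate groups, the text between them is removed as well.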

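Answer

Assuming (1) the parentheses always match, and (2) we only remove the parentheses and everything between them (i.e. the whitespace around the parentheses is left unchanged), the following should work.

It is basically a state machine that tracks the current depth of nested parentheses. We keep a character if (1) it is not a parenthesis and (2) the current depth is 0.

No regex. No recursion. A single pass through the input string without any intermediate lists.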

tests = [
    "Hi this is a test (a b (c d) e) sentence",
    "(Data with in (Boo) And good luck)",
]

# Change in nesting depth caused by each parenthesis character.
delta = {
    '(': 1,
    ')': -1,
}

def remove_paren_groups(text):
    depth = 0  # current nesting depth

    for c in text:
        d = delta.get(c, 0)
        depth += d
        # Skip the parentheses themselves and anything nested inside them.
        if d != 0 or depth > 0:
            continue
        yield c

for text in tests:
    print(' IN: %r' % text)
    print('OUT: %r' % ''.join(remove_paren_groups(text)))

Output:

 IN: 'Hi this is a test (a b (c d) e) sentence'
OUT: 'Hi this is a test  sentence'
 IN: '(Data with in (Boo) And good luck)'
OUT: ''
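
If assumption (1) does not hold, a stray ")" pushes the depth below zero and a later group is no longer removed. One possible way to tolerate that, sketched here under my own assumptions, is to clamp the depth at zero. The name remove_paren_groups_tolerant and the choice to silently drop a stray ")" are illustrative and not part of the answer above.

```python
def remove_paren_groups_tolerant(text):  # hypothetical variant
    """Yield the characters of text that lie outside any parentheses."""
    depth = 0  # current nesting depth, never allowed below zero
    for c in text:
        if c == '(':
            depth += 1
        elif c == ')':
            # A stray ")" at depth 0 is dropped without going negative.
            depth = max(depth - 1, 0)
        elif depth == 0:
            yield c

print(repr(''.join(remove_paren_groups_tolerant('a) b (c) d'))))
# prints: 'a b  d'
```

An unclosed "(" still swallows the rest of the string, since the depth never returns to zero.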
