Best way to get an email ID from a string using Python

Question:

How can I get the 'X-Mailer-recipient:' email ID from the string below using Python?

Using re?

Received: from localhost6.localdomain6 (unknown [59.92.85.188]) 
     by smtp.webfaction.com (Postfix) with ESMTP id 05B332078BD1 
     for <[email protected]>; Fri, 26 Aug 2011 04:59:36 -0500 (CDT) 
    Content-Type: text/html; charset="utf-8" 
    MIME-Version: 1.0 
    Content-Transfer-Encoding: 7bit 
    Subject: Test subject100 
    From: [email protected] 
    To: [email protected] 
    Date: Fri, 26 Aug 2011 10:01:39 -0000 
    Message-ID: <[email protected]> 
    X-Mailer-status: false 
    X-Mailer-recipient: [email protected] 

Thanks

You can also use something like this:

d = """Received: from localhost6.localdomain6 (unknown [59.92.85.188]) by smtp.webfaction.com (Postfix) with ESMTP id 05B332078BD1 for <[email protected]>; Fri, 26 Aug 2011 04:59:36 -0500 (CDT) Content-Type: text/html; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: Test subject100 From: [email protected] To: [email protected] Date: Fri, 26 Aug 2011 10:01:39 -0000 Message-ID: <[email protected]> X-Mailer-status: false X-Mailer-recipient: [email protected]""" 

if 'X-Mailer-recipient:' in d: 
    # Take the text after the header name, then keep only the first whitespace-delimited token
    print(d.split('X-Mailer-recipient:')[1].split()[0])
>>> [email protected] 

This also includes any text after the email address. – Spycho

@Spycho - thanks, fixed. –

Thanks for your answer. Actually, the X-Mailer-recipient header may be in the middle of the string, and I should still get the correct email – shivg

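Use the regular expression X-Mailer-recipient:\s*(.*). Python supports regular expressions through its re module. You need to make sure you don't accidentally capture more text than you are looking for. For example, the regex above would match all of "X-Mailer-recipient:[email protected] BLARG BLARG BLARG". You then need to access the capture group you want.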
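A minimal sketch of the regex approach, assuming the headers are available as one string with one header per line. The address user@example.com and the helper name find_recipient are made up for illustration. The capture group here is \S+ rather than .*, a deliberate change so that trailing text after the address is not captured.

```python
import re

# Hypothetical sample input; the address is a placeholder, not taken from the original message.
raw = (
    "Subject: Test subject100\n"
    "X-Mailer-status: false\n"
    "X-Mailer-recipient: user@example.com BLARG BLARG\n"
    "Date: Fri, 26 Aug 2011 10:01:39 -0000\n"
)

# re.MULTILINE lets ^ match at the start of every line, so the header can sit
# anywhere in the string. re.IGNORECASE is used because header names are
# case-insensitive. [ \t]* (not \s*) keeps the match on the same line.
PATTERN = re.compile(r"^X-Mailer-recipient:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE)


def find_recipient(text):
    """Return the first token after the header name, or None if it is absent."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(find_recipient(raw))
```

This still treats the message as plain text, so it cannot tell a real header from the same text appearing inside another header's value or in the body.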

Use the email package:

from email import message_from_string 

msg = '''Received: from localhost6.localdomain6 (unknown [59.92.85.188]) 
    by smtp.webfaction.com (Postfix) with ESMTP id 05B332078BD1 
    for <[email protected]>; Fri, 26 Aug 2011 04:59:36 -0500 (CDT) 
Content-Type: text/html; charset="utf-8" 
MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit 
Subject: Test subject100 
From: [email protected] 
To: [email protected] 
Date: Fri, 26 Aug 2011 10:01:39 -0000 
Message-ID: <[email protected]> 
X-Mailer-status: false 
X-Mailer-recipient: [email protected] 
''' 
mail = message_from_string(msg) 
print(mail['x-mailer-recipient'])

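Using a regex is not a good idea, because a) header names are case-insensitive, b) there can be multiple headers with the same name, and c) one header can contain another. For example, someone might have the mail address "X-Mailer-recipient:@hotmail.com", which would confuse a regex-based approach.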
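Points a) and b) can be checked with the email package itself. The following is a sketch with made-up placeholder addresses (first@example.com and second@example.com), not the original message. It also shows one likely reason for a lookup returning None: a header line that starts with whitespace is parsed as a continuation of the previous header, so an indented copy of the message does not expose X-Mailer-recipient as a header of its own.

```python
from email import message_from_string

# Hypothetical message with the same header name twice, in different letter case.
raw = (
    "Subject: Test subject100\n"
    "X-Mailer-recipient: first@example.com\n"
    "x-mailer-RECIPIENT: second@example.com\n"
    "\n"
    "body\n"
)
mail = message_from_string(raw)

# Lookup by name ignores case. With duplicate headers, the documentation
# leaves unspecified which value this returns.
first = mail["x-mailer-recipient"]

# get_all returns a list with the value of every header of that name.
everything = mail.get_all("X-Mailer-Recipient")

# A header that is not present yields None rather than raising KeyError.
missing = mail["X-No-Such-Header"]

# An indented header line is folded into the previous header (Subject here),
# so looking it up by name finds nothing.
indented = "Subject: Test subject100\n    X-Mailer-recipient: first@example.com\n\nbody\n"
folded = message_from_string(indented)["X-Mailer-recipient"]

print(first, everything, missing, folded)
```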

It just returns None – shivg

You can do case-insensitive regex matching – steabert

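@steabert, yes, that's true. I guess my point is that for email headers, someone has already taken care of all the details involved in parsing them, so you don't have to build your own parsing mechanism. –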