This article explains how to read .eml email files with Python and provides code examples for each step.
1. Import the required modules
First, import the Python modules needed to work with eml files:
import email
import os
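2. Open the eml file
The email module's message_from_binary_file() function parses an open eml file directly into a message object. Opening the file in binary mode ('rb') avoids decoding errors when the message contains non-ASCII bytes:
eml_file = 'sample.eml'
with open(eml_file, 'rb') as file:
    msg = email.message_from_binary_file(file)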
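As a hedged alternative, the standard library also offers a newer API: passing policy=email.policy.default to the parsing functions yields an EmailMessage object instead of the legacy Message. The sketch below parses from in-memory bytes so it runs without a file on disk; RAW_EML, parse_eml_bytes and msg_demo are illustrative names, not part of the original article.

```python
from email import message_from_bytes, policy

# Hypothetical sample message, used only for illustration
RAW_EML = (
    b"From: Alice <alice@example.com>\r\n"
    b"To: Bob <bob@example.com>\r\n"
    b"Subject: Hello\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hi Bob!\r\n"
)


def parse_eml_bytes(raw):
    """Parse raw .eml bytes into an EmailMessage using the modern policy."""
    return message_from_bytes(raw, policy=policy.default)


msg_demo = parse_eml_bytes(RAW_EML)
print(type(msg_demo).__name__)          # class name of the parsed object
print(msg_demo["Subject"])              # header value, already decoded
print(msg_demo.get_content().strip())   # body text as str
```

For a file on disk, the same policy argument can be passed to email.message_from_binary_file().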
3. Get the message information
From the message object we can read the message's header fields, such as the sender, recipient, subject, and date:
from_email = msg['From']
to_email = msg['To']
subject = msg['Subject']
date = msg['Date']
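Header values read this way can arrive in RFC 2047 encoded form (for example =?utf-8?q?...?=) when they contain non-ASCII characters. The following is a minimal sketch of decoding them with the standard library; decode_mime_header is a hypothetical helper name and the sample values are made up for illustration.

```python
from email.header import decode_header, make_header
from email.utils import parseaddr, parsedate_to_datetime


def decode_mime_header(value):
    """Return a readable str for a possibly RFC 2047 encoded header value."""
    if value is None:
        return ''
    return str(make_header(decode_header(value)))


# Illustrative header values (assumptions, not taken from a real message)
raw_subject = '=?utf-8?q?Caf=C3=A9_menu?='
raw_from = 'Alice <alice@example.com>'
raw_date = 'Mon, 01 Jan 2024 12:00:00 +0800'

subject_text = decode_mime_header(raw_subject)
display_name, address = parseaddr(raw_from)      # split name and address
sent_at = parsedate_to_datetime(raw_date)        # timezone-aware datetime

print(subject_text)
print(display_name, address)
print(sent_at.isoformat())
```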
4. Parse the message body
The body may be plain text, HTML, or several formats at once. We can walk through it recursively:
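def parse_body(msg):
    if msg.is_multipart():
        # A multipart message holds a list of sub-parts; recurse into each one
        for part in msg.get_payload():
            parse_body(part)
    else:
        content_type = msg.get_content_type()
        # Fall back to UTF-8 when the part declares no charset
        charset = msg.get_content_charset() or 'utf-8'
        if content_type == 'text/plain':
            # decode=True reverses base64/quoted-printable and returns bytes
            text_content = msg.get_payload(decode=True).decode(charset, errors='replace')
            # Process the plain-text body
        elif content_type == 'text/html':
            html_content = msg.get_payload(decode=True).decode(charset, errors='replace')
            # Process the HTML body
parse_body(msg)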
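The recursive function above leaves the decoded text in local variables. One possible variant, sketched here under the assumption that the caller wants the text returned, collects every text part into a dictionary using walk(). extract_bodies and the demo message are hypothetical names built only for this example.

```python
from email.message import EmailMessage


def extract_bodies(msg):
    """Collect decoded text/plain and text/html parts from a message."""
    bodies = {'text/plain': [], 'text/html': []}
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        if part.get_content_disposition() == 'attachment':
            continue  # skip text files sent as attachments
        content_type = part.get_content_type()
        if content_type in bodies:
            payload = part.get_payload(decode=True) or b''
            charset = part.get_content_charset() or 'utf-8'
            bodies[content_type].append(payload.decode(charset, errors='replace'))
    return bodies


# Build a small multipart/alternative message to exercise the function
demo = EmailMessage()
demo['Subject'] = 'Demo'
demo.set_content('Hello in plain text')
demo.add_alternative('<p>Hello in HTML</p>', subtype='html')

result = extract_bodies(demo)
print(result['text/plain'][0].strip())
print(result['text/html'][0].strip())
```

Returning lists rather than single strings keeps every part when a message carries more than one text section.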
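5. Save attachments
A message may contain attachments. We can use the get_filename() method to get an attachment's file name and get_payload(decode=True) to get its decoded content: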
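def save_attachments(msg):
    # Create the output directory up front so open() does not fail
    os.makedirs('attachments', exist_ok=True)
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue
        filename = part.get_filename()
        payload = part.get_payload(decode=True)
        if filename and payload is not None:
            # Save the attachment; basename() drops any directory components
            save_path = os.path.join('attachments', os.path.basename(filename))
            with open(save_path, 'wb') as file:
                file.write(payload)
save_attachments(msg)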
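Attachment file names come from the sender, so they may contain path components or clash with files that already exist. The sketch below shows one way to build a safer target path; safe_attachment_path is a hypothetical helper, and the temporary directory stands in for the attachments folder.

```python
import os
import tempfile


def safe_attachment_path(directory, filename):
    """Return a path inside directory that does not overwrite existing files."""
    # Keep only the last path component so '../x' cannot leave the directory
    name = os.path.basename(filename.replace('\\', '/')) or 'attachment.bin'
    root, ext = os.path.splitext(name)
    candidate = os.path.join(directory, name)
    counter = 1
    # Append _1, _2, ... until the name is free
    while os.path.exists(candidate):
        candidate = os.path.join(directory, f'{root}_{counter}{ext}')
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    first = safe_attachment_path(tmp, '../../report.pdf')
    with open(first, 'wb') as fh:
        fh.write(b'demo')
    second = safe_attachment_path(tmp, 'report.pdf')
    first_name = os.path.basename(first)
    second_name = os.path.basename(second)

print(first_name, second_name)
```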
6. Complete code example
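import email
import os

def parse_body(msg):
    if msg.is_multipart():
        # A multipart message holds a list of sub-parts; recurse into each one
        for part in msg.get_payload():
            parse_body(part)
    else:
        content_type = msg.get_content_type()
        # Fall back to UTF-8 when the part declares no charset
        charset = msg.get_content_charset() or 'utf-8'
        if content_type == 'text/plain':
            # decode=True reverses base64/quoted-printable and returns bytes
            text_content = msg.get_payload(decode=True).decode(charset, errors='replace')
            # Process the plain-text body
        elif content_type == 'text/html':
            html_content = msg.get_payload(decode=True).decode(charset, errors='replace')
            # Process the HTML body

def save_attachments(msg):
    # Create the output directory up front so open() does not fail
    os.makedirs('attachments', exist_ok=True)
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue
        filename = part.get_filename()
        payload = part.get_payload(decode=True)
        if filename and payload is not None:
            # Save the attachment; basename() drops any directory components
            save_path = os.path.join('attachments', os.path.basename(filename))
            with open(save_path, 'wb') as file:
                file.write(payload)

# Binary mode avoids decoding errors on non-ASCII bytes
eml_file = 'sample.eml'
with open(eml_file, 'rb') as file:
    msg = email.message_from_binary_file(file)

from_email = msg['From']
to_email = msg['To']
subject = msg['Subject']
date = msg['Date']

parse_body(msg)
save_attachments(msg)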
That covers how to read eml files with Python. By parsing the file into a message object, we can read the header fields and process the body and attachments.