microsoft graph api - Emails being parsed with extra Equals Signs scattered through the body on Linux and not Windows

I am using MimeKit to retrieve emails from an Office 365 Exchange server. I'm using the Graph API to do this, so the code to retrieve and create the MimeMessage object is simply:

var mimeContentStream = await _Client.Users[_Username].Messages[uid].Content.GetAsync();
var mimeMessage = await MimeKit.MimeMessage.LoadAsync(mimeContentStream);

When I save the message (to disk, database, blob storage, whatever), I call WriteToAsync on a memory stream and store the byte[]:

using (var stream = new MemoryStream())
{
    await mimeMessage.WriteToAsync(stream);
    var byteArr = stream.ToArray();
    // store byte[] somewhere
}

Anywhere that the email is retrieved from storage (disk, database, etc...) and opened in any sort of email browser or rendered, it's littered with equals signs. Sometimes they replace characters, sometimes they sit between characters.

HOWEVER, if the above is done on a Windows server, no problem. On a Linux server, problem.

The extract below is just a portion of such an email (the first several lines of the text/html part of the message). For any line that does end with an equals sign, the equals sign is the 76th character in that line (followed by any line break chars).

--00000000000035163c0630224305
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Dutf-8"><d=
iv dir=3D"auto"></div><br><div class=3D"gmail_quote gmail_quote_container">=
<div dir=3D"ltr" class=3D"gmail_attr">---------- Forwarded message --------=
-<br>From: <strong class=3D"gmail_sendername" dir=3D"auto">Cara Abrahamse</=
strong> <span dir=3D"auto">&lt;<a href=3D"mailto:[email protected]">=
[email protected]</a>&gt;</span><br>Date: Wed, 12 Mar 2025, 11:47<br=
>Subject: Payment details AAAA 1<br>To: <a href=3D"mailto:xxxxxxxxxxxxxxxxx=
@gmail">[email protected]</a> &lt;<a href=3D"mailto:xxxxxxxxx=
[email protected]">[email protected]</a>&gt;<br></div><br><br>

<div lang=3D"EN-US" link=3D"#467886" vlink=3D"#96607D" style=3D"word-wrap:b=
reak-word">
<div class=3D"m_-8096161862487316557WordSection1">
<p class=3D"MsoNormal"><span style=3D"color:black">&nbsp;</span><span style=
=3D"font-size:14.0pt;color:black">Hi&nbsp;Sean,<u></u><u></u></span></p>
<p class=3D"MsoNormal"><span style=3D"font-size:14.0pt;color:black">&nbsp;<=
u></u><u></u></span></p>
<p class=3D"MsoNormal"><span style=3D"font-size:14.0pt;color:black">I hope =
this message finds you well. Thank you for using our services! Please find =
below payment details for the outstanding amount of R80 000.00 for services=
 rendered.&nbsp;<u></u><u></u></span></p>
<p class=3D"MsoNormal"><span style=3D"font-size:14.0pt;color:black">&nbsp;<=
u></u><u></u></span></p>
<p class=3D"MsoNormal"><b><span style=3D"font-size:14.0pt;color:black">Name=
:</span></b><span style=3D"font-size:14.0pt;color:black">&nbsp;Cara Abraham=
se<u></u><u></u></span></p>
<p class=3D"MsoNormal"><b><span style=3D"font-size:14.0pt;color:black">ID N=
umber</span></b><span style=3D"font-size:14.0pt;color:black;background:whit=
e">: 7209020000000</span><span style=3D"font-size:14.0pt;color:black"><u></=
u><u></u></span></p>
<p class=3D"MsoNormal"><b><span style=3D"font-size:14.0pt;color:black">Bank=

I think that this is a line break / format / encoding issue somewhere? I've seen reference to the 'quoted-printable' transfer encoding being key to this (maybe). But I haven't been able to work out where I need to fix this. Is it in the parsing? Or the writing? Or both? Or something in the way that I'm retrieving the mail via the Graph API?

I don't know if this is truly a Windows/Linux issue, or something else - the operating system is the only difference between instances of this working and it not working at the moment.

I haven't managed to find a set of ParserOptions or FormatOptions that resolves the issue on Linux environments. I have found several reports of these "equals signs" around the web, but none that led me to a solution here.

Asked Mar 12 at 12:37 by pcbulldozer (edited Mar 13 at 14:43). Comments (4):
  • Can you post an example of what MimeKit saves when the output contains the "scattered '=' signs"? Try saving the mimeContentStream to disk so that you can look at the original raw message content before MimeKit parses it. Does it have the '=' signs? It's important to see EXACTLY where the = signs are to determine if they are part of an encoding or not. – jstedfast Commented Mar 13 at 0:17
  • Thanks @jstedfast, I have added a small sample (not the entire mail). The equals sign is (apparently) the 76th character in each line of the raw text, so perhaps it is something between the line length (I read the unix history article) and the quoted-printable transfer encoding? Just not sure how to configure / use MimeKit to parse/save this properly. – pcbulldozer Commented Mar 13 at 14:46
  • Yes, that is the result of quoted-printable encoding. I'm not sure why this would only be happening on Linux and not Windows, but MimeKit won't change the encoding of a message while it is parsing it or re-writing it to disk (in fact, I go to great lengths to make sure that when MimeKit writes the message back out, it is byte-for-byte exactly the same as the input - there are a few edge cases where this isn't quite right, but it will never decide to encode an entire MIME part with quoted-printable). – jstedfast Commented Mar 13 at 14:50
  • Thanks for your help - it got me closer to finding something that worked. I don't understand the problem properly, but did find a way through it. Appreciated! – pcbulldozer Commented Mar 14 at 13:14

2 Answers

MimeKit will not change the Content-Transfer-Encoding of a MimePart that it parses or writes to disk to quoted-printable or any other encoding. It will always save the message back out as close to the original byte-for-byte data stream that it parsed (short of any bugs that I'm aware of that sometimes add an extra newline to the end of the message output).

This leaves the Graph API as the source of the quoted-printable encoding unless the message was using quoted-printable when it was originally sent by a client.

Based on what you've said about downloading the exact same message on Linux and Windows and it using a different encoding on each OS, my only conclusion is that the Graph API must be requesting this encoding in the HTTP request somehow depending on which OS the code is running on.

Since the Graph API likely uses HTTPS, I would probably recommend trying to get an HTTP/S trace using a local proxy such as Charles Proxy or maybe Fiddler(?) to try and figure out what the difference is between the HTTP requests coming from each OS to the Graph API endpoints and to verify that the responses are indeed different depending on the client OS.

You could also verify response differences by just saving the mimeContentStream to disk and then comparing the results.
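
One way to make that comparison concrete is to dump the raw stream and count line endings, since CRLF versus bare LF is easy to miss in a text diff. A minimal sketch, assuming the mimeContentStream from the question; RawMimeDump, SaveAndCountAsync, CountLineEndings and the output path are hypothetical names, not part of MimeKit or the Graph SDK:

```csharp
using System.IO;
using System.Threading.Tasks;

public static class RawMimeDump
{
    // Copies the raw Graph response to disk untouched, then reports how many
    // lines end in CRLF versus a bare LF.
    public static async Task<(int Crlf, int BareLf)> SaveAndCountAsync(
        Stream mimeContentStream, string path)
    {
        using var buffer = new MemoryStream();
        await mimeContentStream.CopyToAsync(buffer);
        byte[] raw = buffer.ToArray();

        await File.WriteAllBytesAsync(path, raw);
        return CountLineEndings(raw);
    }

    // Counts every LF byte, classifying it by whether a CR precedes it.
    public static (int Crlf, int BareLf) CountLineEndings(byte[] data)
    {
        int crlf = 0, bareLf = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] != (byte)'\n')
                continue;

            if (i > 0 && data[i - 1] == (byte)'\r')
                crlf++;
            else
                bareLf++;
        }
        return (crlf, bareLf);
    }
}
```

Running this on both the Windows and the Linux host for the same message, and then running CountLineEndings on the bytes produced by WriteTo, should show at which step the two environments start to differ.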

Depending on how your code works, it may also be worth considering saving the original mimeContentStream to your database instead of parsing and then re-writing the message to a MemoryStream. As much as I'd love to claim that MimeKit is perfect, there are a few bugs that I'm working on trying to fix that can sometimes add an extra newline to the end of a MimeMessage or MimePart. These edge cases are limited to instances where the original MIME stream was malformed in some way (e.g. missing a newline at the end of the message stream), but if you want to be able to guarantee perfection, I would recommend saving the raw/original MIME streams.

I don't know if I'll ever be good enough to understand the real reason for this, but I did find a solution to my problem.

As I write this, I realise that I'm also using MimeKit v4.8.0 while v4.11.0 is available. I have not yet tested on v4.11.0.

Essentially the problem seems to occur when:

  • Email stream is read into a new MimeMessage

  • WriteTo is called on that MimeMessage to write to a new stream / byte array

  • A new MimeMessage is created from this new stream / byte array

The solution (in my case at least) was to store the original byte array only, and not to store or reuse the output of the MimeMessage.WriteTo() method.
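
A minimal sketch of that approach, assuming the Graph stream can be buffered in memory: the raw bytes are kept for storage and the MimeMessage is parsed from a copy of them, so nothing that MimeKit writes back out is ever persisted. RawMessageStore and ReadAsync are hypothetical names.

```csharp
using System.IO;
using System.Threading.Tasks;
using MimeKit;

public static class RawMessageStore
{
    // Buffers the raw MIME stream once and returns both the untouched bytes
    // (for storage) and a MimeMessage parsed from those same bytes.
    public static async Task<(byte[] Raw, MimeMessage Message)> ReadAsync(
        Stream mimeContentStream)
    {
        using var buffer = new MemoryStream();
        await mimeContentStream.CopyToAsync(buffer);
        byte[] raw = buffer.ToArray();

        // Parse from a read-only view of the bytes; the array itself is
        // never modified and is what should go to disk or the database.
        using var parseStream = new MemoryStream(raw, writable: false);
        MimeMessage message = await MimeMessage.LoadAsync(parseStream);

        return (raw, message);
    }
}
```

Store Raw, and whenever the message is needed again, load a fresh MimeMessage from those stored bytes.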

But again, I only hit this problem on a Linux box (Windows seemed fine), and the ONLY difference I could find (on a byte-for-byte basis) between the mails that worked and didn't work was the line endings (\r\n worked; \n did not work).
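
If re-serialising cannot be avoided, it may be worth pinning the line-ending format instead of relying on the platform default. The sketch below uses MimeKit's FormatOptions.NewLineFormat property and the WriteToAsync(FormatOptions, Stream) overload. The default appears to follow the host OS (\n on Linux, \r\n on Windows), which would fit the observation above, but whether this removes the symptom is an assumption that has not been tested against these messages. CrlfWriter and ToCrlfBytesAsync are hypothetical names.

```csharp
using System.IO;
using System.Threading.Tasks;
using MimeKit;

public static class CrlfWriter
{
    // Serialises the message with "\r\n" line endings regardless of the OS
    // the code happens to run on.
    public static async Task<byte[]> ToCrlfBytesAsync(MimeMessage message)
    {
        // Clone so the shared FormatOptions.Default instance is left untouched.
        FormatOptions options = FormatOptions.Default.Clone();
        options.NewLineFormat = NewLineFormat.Dos;

        using var stream = new MemoryStream();
        await message.WriteToAsync(options, stream);
        return stream.ToArray();
    }
}
```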

The thing that was causing all of those weird equals signs was some sort of off-by-one issue when handling the line breaks and/or line lengths.

For example, given the below 2-line snip of the "quoted-printable" section of the message (I have added the \r\n characters for clarity):

<div lang=3D"EN-US" link=3D"#467886" vlink=3D"#96607D" style=3D"word-wrap:b=\r\n
reak-word">

this SHOULD be extracted / translated to:

<div lang=3D"EN-US" link=3D"#467886" vlink=3D"#96607D" style=3D"word-wrap:break-word">

but INSTEAD was getting extracted / translated to:

<div lang=3D"EN-US" link=3D"#467886" vlink=3D"#96607D" style=3D"word-wrap:b=eak-word">

In other words, it was keeping the "=" at the end of each line, and dropping the first character from the subsequent line. This obviously broke the HTML structure, and if it was actual content that was spanning the line break, you'd see characters being "replaced" by "=" signs.
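
For reference, the expected joining behaviour can be sketched in a few lines. In quoted-printable, a "=" immediately followed by a line break is a soft line break, and both characters are dropped on decoding. The helper below is a standalone illustration, not MimeKit's decoder. It treats "=\r\n" and "=\n" the same way and, like the example above, leaves escapes such as =3D untouched. SoftLineBreaks and Join are hypothetical names.

```csharp
using System.Text;

public static class SoftLineBreaks
{
    // Removes quoted-printable soft line breaks: "=" immediately followed by
    // CRLF or a bare LF. Escapes such as "=3D" are left in place on purpose.
    public static string Join(string encoded)
    {
        var sb = new StringBuilder(encoded.Length);
        for (int i = 0; i < encoded.Length; i++)
        {
            if (encoded[i] == '=' && i + 1 < encoded.Length)
            {
                // "=\n": skip the "=" and the LF.
                if (encoded[i + 1] == '\n')
                {
                    i += 1;
                    continue;
                }

                // "=\r\n": skip the "=", the CR and the LF.
                if (encoded[i + 1] == '\r' && i + 2 < encoded.Length && encoded[i + 2] == '\n')
                {
                    i += 2;
                    continue;
                }
            }
            sb.Append(encoded[i]);
        }
        return sb.ToString();
    }
}
```

With this helper, both SoftLineBreaks.Join("word-wrap:b=\r\nreak-word") and SoftLineBreaks.Join("word-wrap:b=\nreak-word") return "word-wrap:break-word". Comparing that against what a viewer actually renders can help show whether the viewer is mishandling one of the two line-ending styles.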

For good measure, some commented code that illustrates the issue I was seeing:

// Gets the message stream from Microsoft Graph API
var mimeContentStream = await _Client.Users[_Username].Messages[uid].Content.GetAsync();

// Loads the message stream into a MimeMessage object
// write this mimeMessage to disk / view in email client, and I get the correct body
var mimeMessage = await MimeKit.MimeMessage.LoadAsync(mimeContentStream);

// writes the MimeMessage object to a byte array
byte[] mimeMessageArr;
using (var ms = new MemoryStream())
{
    mimeMessage.WriteTo(ms);
    mimeMessageArr = ms.ToArray();
}

// Loads the byte array into a new MimeMessage object
MimeMessage mimeMessageAgain;
using (var ms = new MemoryStream(mimeMessageArr))
{
    mimeMessageAgain = await MimeKit.MimeMessage.LoadAsync(ms);
}

// write mimeMessageAgain to disk / view in email client, and I get lots of EQUALS signs interspersed with the (mis-formatted) body

I don't think the fact that I was retrieving this from a Microsoft Graph API call makes the difference. I have no idea why Linux/Windows would make a difference. All I know is that after 2 days I can finally sleep.
