c# - Extraneous line breaks in Saxon XSLT output - Stack Overflow

I have an XML document like this (simplified for this example):

<DetailBill>
  <DetailBillInfo>
    <NoticeType>Detail Invoice</NoticeType>
    <Application>1</Application>
  </DetailBillInfo>
  <DetailBillInfo>
    <NoticeType>Detail Invoice</NoticeType>
    <Application>2</Application>
  </DetailBillInfo>
  <DetailBillInfo>
    <NoticeType>Detail Invoice</NoticeType>
    <Application>3</Application>
  </DetailBillInfo>
</DetailBill>

Using XSLT, I want to transform that into tab-delimited text like this:

NoticeType\tApplication
Detail Invoice\t1
Detail Invoice\t2
Detail Invoice\t3

Here is the stylesheet I'm using:

<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" indent="no" />
  <xsl:template match="/DetailBill">
    <xsl:text>NoticeType&#x9;Application&#xA;</xsl:text>
    <xsl:for-each select="DetailBillInfo">
      <xsl:value-of select="normalize-space(NoticeType)" />
      <xsl:text>&#x9;</xsl:text>
      <xsl:value-of select="normalize-space(Application)" />
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

I'm invoking the transformation in .NET 8.0 with SaxonCS 12.4 like this:

System.Xml.Linq.XElement xml = System.Xml.Linq.XElement.Parse("{xml}");
System.Xml.Linq.XElement stylesheet = System.Xml.Linq.XElement.Parse("{xslt}");
Saxon.Api.Processor processor = new(true);
Saxon.Api.XsltCompiler comp = processor.NewXsltCompiler();
List<Saxon.Api.Error> errs = [];
comp.ErrorReporter = errs.Add;
Saxon.Api.XsltExecutable exe = comp.Compile(stylesheet.CreateReader());
Saxon.Api.Xslt30Transformer xfrm = exe.Load30();
Saxon.Api.XdmValue output = xfrm.ApplyTemplates(xml.CreateReader());

But, the output I get looks like this:

NoticeType\tApplication

Detail Invoice
\t
1


Detail Invoice
\t
2


Detail Invoice
\t
3

Can anyone tell what I'm doing wrong?

Update:

Per Martin Honnen's suggestion, I tried using a Saxon.Api.Serializer, and that gave the expected output. Here is an updated C# code sample:

System.Xml.Linq.XElement xml = System.Xml.Linq.XElement.Parse("{xml}");
System.Xml.Linq.XElement stylesheet = System.Xml.Linq.XElement.Parse("{xslt}");
Saxon.Api.Processor processor = new(true);
Saxon.Api.XsltCompiler comp = processor.NewXsltCompiler();
List<Saxon.Api.Error> errs = [];
comp.ErrorReporter = errs.Add;
Saxon.Api.XsltExecutable exe = comp.Compile(stylesheet.CreateReader());
Saxon.Api.Xslt30Transformer xfrm = exe.Load30();
Saxon.Api.Serializer ser = processor.NewSerializer();
System.Text.StringBuilder sb = new();
using System.IO.StringWriter sw = new(sb);
ser.OutputWriter = sw;
xfrm.ApplyTemplates(xml.CreateReader(), ser);

Once that has run, sb contains the expected output. I don't have deep enough insight into Saxon to understand why using the Serializer rather than just output.ToString() makes a difference. I'm just glad to have it working.

Asked Feb 1 at 21:13 and edited Feb 1 at 22:40 by JTennessen. Comments:
  • So where is the code producing that output? How do you look at the output? You have only shown an ApplyTemplates call returning an XdmValue. Anyway, what is all the System.Xml.Linq stuff needed for? And if you want to get text output and have Saxon control the serialization, why don't you use a Serializer as the destination of ApplyTemplates? – Martin Honnen Commented Feb 1 at 21:40
  • @MartinHonnen, not sure I understand your first question. output serializes as "NoticeType\tApplication\r\n\r\nDetail Invoice\r\n\t\r\n1\r\n\r\n\r\nDetail Invoice\r\n\t\r\n2\r\n\r\n\r\nDetail Invoice\r\n\t\r\n3". The C# is simplified sample code, so the System.Xml.Linq variables are just there to represent the XML data and XSLT stylesheet used in the transformation and to provide a working example. That having been said, I am no one's expert at this, so am open to better ways of doing it. I will try your suggestion of using a Serializer. Thanks! – JTennessen Commented Feb 1 at 22:01
  • @MartinHonnen, the Serializer did the trick. If you want to post that as an answer, I will accept it. Much appreciated! – JTennessen Commented Feb 1 at 22:40

1 Answer

If you want text output where Saxon controls the serialization, including the whitespace, based on your <xsl:output method="text" indent="no" /> and your XSLT code, then I would suggest not taking an XdmValue as the result of ApplyTemplates. Instead, pass a Serializer (over e.g. a StringWriter or a Stream) as the second argument to ApplyTemplates.

Then call ToString() on the StringWriter and I think you will get the output you want as a string. Or, of course, Saxon simply writes to a file if you provide a (File)Stream to the Serializer.

As for the result of calling ToString() on the XdmValue returned from ApplyTemplates, there are two things to understand. First, that overload returns "the raw result of applying templates to the supplied selection value, without wrapping in a document node or serializing the result", so your result is an XdmValue holding a sequence of text nodes. Second, the ToString() call on the XdmValue seems to output each text node (or probably each item in the sequence) on a separate line. So an XSLT stylesheet whose raw XDM value gives the wanted output when printed with ToString() would be e.g.

<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" indent="no" />
  <xsl:template match="/DetailBill">
    <xsl:text>NoticeType&#x9;Application</xsl:text>
    <xsl:for-each select="DetailBillInfo">
      <xsl:value-of select="(NoticeType, Application)!normalize-space()" separator="&#x9;" />
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

That is just one example; there are other options. However, I think using a Serializer is the right approach if you write XSLT and want Saxon to apply the serialization options/parameters defined in your xsl:output.
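
To make the difference concrete, here is a small library-free model in Python of the behaviour described above. It is only a sketch and does not use Saxon: each string stands for one text node in the raw result, the sample rows come from the question's XML, and all function names are hypothetical. It assumes, as inferred from the observed output, that ToString() on the XdmValue puts one item per line, and that the text output method writes the text nodes back to back.

```python
# Hypothetical model, not Saxon's API: one string per text node in the raw result.
HEADER = "NoticeType\tApplication"
ROWS = [("Detail Invoice", "1"), ("Detail Invoice", "2"), ("Detail Invoice", "3")]


def original_items():
    """Nodes from the question's stylesheet: header plus newline, then four nodes per row."""
    items = [HEADER + "\n"]
    for notice, application in ROWS:
        items += [notice, "\t", application, "\n"]
    return items


def revised_items():
    """Nodes from the revised stylesheet: header, then one tab-joined node per row."""
    return [HEADER] + [f"{notice}\t{application}" for notice, application in ROWS]


def item_per_line(items):
    """Assumed ToString() behaviour on the raw value: one item per line."""
    return "\n".join(items)


def concatenated(items):
    """Assumed effect of the text output method: nodes written with nothing in between."""
    return "".join(items)


if __name__ == "__main__":
    print(repr(item_per_line(original_items())))  # extra line breaks, as in the question
    print(repr(concatenated(original_items())))   # Serializer-style result
    print(repr(item_per_line(revised_items())))   # revised stylesheet printed item per line
```

In this model the question's stylesheet only gives clean rows when its nodes are concatenated, which is what the Serializer route does, while the revised stylesheet gives clean rows when printed item per line because it emits exactly one node per output line.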
