svn status powershell non-ascii characters problem - Stack Overflow

I have a weird thing going on in my PowerShell script that uses svn commands. The following is an example script:

    $svnOutput = svn status
    Write-Host "Output when saved in a variable"
    $svnOutput
    Write-Host "Direct Output"
    svn status

If I run this script (in the PowerShell console), I get two different outputs when one of the file names contains non-ASCII characters (in my example, umlauts such as üöä). This is the output:

    Output when saved in a variable
    ?    Test_���.txt
    Direct Output
    ?    Test_äöü.txt

I am working on Windows Server 2022 with VisualSVN Server version 5.4.1. I have already tried the following ideas:

    chcp 65001
    $svnOutput = svn status
    $svnOutput

    $OutputEncoding = [System.Text.Encoding]::UTF8
    [Console]::OutputEncoding = [System.Text.Encoding]::UTF8
    $svnOutput = svn status
    $svnOutput

    $svnOutput = & svn status | Out-String -Stream
    $svnOutput

    svn status > svn_status.txt
    $svnOutput = Get-Content -Path "svn_status.txt" -Encoding UTF8
    $svnOutput

But all of them produce the same garbled output.

PS: This also happens with other commands, such as:

    $svnOutput = svn add . --force
    $svnOutput

which results in:

    A    Test_���.txt

Typing svn add . --force directly in a PowerShell instance, or even in a script, works without any issues. Hopefully someone can help me here. Thanks!

Asked Jan 30 at 16:24 by Sam, edited Feb 4 at 18:30 by mklement0
  • Try: $OutputEncoding = [Console]::InputEncoding = [Console]::OutputEncoding = New-Object System.Text.UTF8Encoding – iRon, Jan 30 at 17:26
  • Your first symptom is the inverse of what normally happens if [Console]::OutputEncoding doesn't match the actual output encoding: normally, direct-to-console output renders fine, but capturing in a variable reveals an encoding mismatch. Also, what do you mean by "in the normal powershell"? – mklement0, Jan 30 at 18:02
  • @mklement0 By "in the normal powershell" I mean just typing the command in a PowerShell instance. But even when I type $svnOutput = svn status and then output it by typing $svnOutput in the PowerShell instance, I get the same symptom as with a script. – Sam, Jan 31 at 9:58
  • @iRon Unfortunately, that doesn't work, neither in the shell nor in the script. – Sam, Jan 31 at 18:14
  • Glad to hear it helped, @Sam. Not restoring the original encoding is a problem if, later in the same session, you run a different console application that emits OEM-encoded output, because PowerShell will then misinterpret that output. – mklement0, Feb 7 at 12:36

1 Answer

The SVN documentation states (emphasis added):

The default character encoding is derived from your operating system's native locale.

This is in the context of the --encoding parameter, which is documented as overriding the default encoding on submitting information ("your commit message"), but it seemingly (and sensibly) also applies when retrieving information.

On Windows, the native locale is the so-called legacy system locale, aka language for non-Unicode programs, and it determines two encodings, via Windows code pages: the OEM code page (which may be, e.g., CP437 or CP850) - typically used by console (terminal) applications - and the ANSI code page (e.g., Windows-1252 or Windows-1251) - typically used by GUI applications.
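
To check which code pages are in effect on a given machine, something like the following may help. This is a minimal sketch for Windows only: it assumes the OEMCP and ACP values sit under the same registry key that this answer queries for the ANSI code page, and the variable names are purely illustrative.

```powershell
# Windows only: read the system locale's code pages from the registry.
$key    = 'HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage'
$oemCp  = [int] (Get-ItemPropertyValue $key OEMCP)   # e.g. 437 or 850
$ansiCp = [int] (Get-ItemPropertyValue $key ACP)     # e.g. 1252 or 1251

# The code page PowerShell currently uses to decode output from external programs.
$consoleCp = [Console]::OutputEncoding.CodePage

# Show all three side by side for comparison.
[pscustomobject]@{ OEM = $oemCp; ANSI = $ansiCp; ConsoleOutput = $consoleCp }
```

If ConsoleOutput matches OEM but the program in question writes ANSI-encoded bytes, captured output is likely to be misdecoded.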

While the SVN documentation doesn't spell out which of these two code pages the svn utility uses, per your own feedback it seems to be the system's active ANSI code page (which, as noted, is unusual, because console applications by convention use the OEM code page; python is similarly unusual).


PowerShell consoles on Windows use the OEM code page by default, as reflected in [Console]::OutputEncoding.
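
The effect of such a mismatch can be reproduced without svn. The following sketch assumes, purely for illustration, Windows-1252 as the ANSI code page and CP850 as the OEM code page; the actual code pages depend on the system locale.

```powershell
# Build the file name from code points so the script file's own encoding doesn't matter.
$name = "Test_$([char]0xE4)$([char]0xF6)$([char]0xFC).txt"   # Test_äöü.txt

# Assumed code pages for illustration: Windows-1252 (ANSI) and CP850 (OEM).
$ansi = [Text.Encoding]::GetEncoding(1252)
$oem  = [Text.Encoding]::GetEncoding(850)

# The bytes a program would emit if it wrote the name using the ANSI code page.
$bytes = $ansi.GetBytes($name)

# Decoding those bytes with the OEM code page yields the wrong characters ...
$misdecoded = $oem.GetString($bytes)
# ... while decoding them with the matching ANSI code page restores the name.
$roundTrip = $ansi.GetString($bytes)

$misdecoded
$roundTrip
```

Only the three non-ASCII characters differ between the two results, because the ASCII range is identical in both code pages.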

Thus, in order for PowerShell to interpret (decode) ANSI output correctly,[1] [Console]::OutputEncoding must be (temporarily) set to the system's active ANSI code page, as follows:

    & {
      # Temporarily change the expected output encoding to the ANSI code page.
      $prevEnc = [Console]::OutputEncoding
      [Console]::OutputEncoding =
        [Text.Encoding]::GetEncoding(
          [int] (Get-ItemPropertyValue HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage ACP)
        )

      svn status

      # Restore the original encoding.
      [Console]::OutputEncoding = $prevEnc
    }
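
Since not restoring the original encoding can cause problems later in the same session (see the comments on the question), the same idea can be wrapped in try / finally. The function below is a hypothetical helper, not part of PowerShell or svn; its name and parameters are assumptions made for this sketch.

```powershell
# Hypothetical helper: runs a script block while [Console]::OutputEncoding
# is temporarily set to the given encoding, then restores the previous one.
function Invoke-WithOutputEncoding {
  param(
    [Parameter(Mandatory)] [scriptblock] $ScriptBlock,
    [Parameter(Mandatory)] [Text.Encoding] $Encoding
  )
  $prevEnc = [Console]::OutputEncoding
  try {
    [Console]::OutputEncoding = $Encoding
    & $ScriptBlock
  }
  finally {
    # Runs even if the script block throws, so tools that emit
    # OEM-encoded output are still decoded correctly afterwards.
    [Console]::OutputEncoding = $prevEnc
  }
}

# Example call (Windows only, requires svn on the PATH):
#   $acp = [int] (Get-ItemPropertyValue HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage ACP)
#   $svnOutput = Invoke-WithOutputEncoding -Encoding ([Text.Encoding]::GetEncoding($acp)) -ScriptBlock { svn status }
```

Capturing the helper's output in a variable, as in the commented example, means svn's output is decoded while the temporary encoding is still in effect.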

Note that (non-CJK) ANSI encodings are fixed single-byte encodings and therefore limited to 256 characters. If you need full Unicode support, use --encoding utf8 in your svn call and set [Console]::OutputEncoding = [Text.UTF8Encoding]::new().[2]

See also:

  • This answer provides background information on how character encoding comes into play when PowerShell talks to external programs.

[1] Note that decoding, i.e. converting an external program's raw byte output into .NET strings (as used by PowerShell) based on a character encoding, only comes into play when an external program's output is either captured (in a variable), relayed, or redirected. When printing directly to the display, encoding problems usually do not surface, because many CLIs use the Unicode-capable WriteConsoleW WinAPI function for that.

[2] Assuming you have administrative privileges, another option is to use UTF-8 as part of your system locale, which sets both the OEM and the ANSI code page to 65001, i.e. UTF-8. However, doing so has far-reaching consequences: see this answer.
