
rust - How to serialize an OsString in a cross-platform manner - Stack Overflow


I am trying to store file paths in a file. These paths arrive as a PathBuf value. Unfortunately PathBuf cannot be converted directly to a byte slice. It can be converted to an OsString, but the problem of being able to write said string to a file still exists.

There is the into_encoded_bytes() method and its unsafe counterpart from_encoded_bytes_unchecked(), but the safety documentation for the latter says it may only be used with data encoded by the same Rust version on the same platform. That is not useful if I want to move my data file between platforms or update the program that produces it.

The from_encoded_bytes_unchecked() method also links to module level discussion of conversions. This describes separate methods of encoding on Windows, Unix, and "Other platforms", but the section titled "All platforms" just points back to the into_encoded_bytes() and from_encoded_bytes_unchecked() methods, which do not allow for reading the data on a different platform than it was written on.

Is there any way to serialize an arbitrary file path which will allow that serialized path to be used in a cross platform manner? Other than attempting to convert to a String, and refusing to work if it's not valid UTF-8?

asked Feb 3 at 5:42 by Nick
  • What if a Linux filename is \xF0? Does that convert to \x00F0 on Windows (ð)? But if a Windows file is named ð, shouldn't that convert to \xC3\xB0 on Linux (the UTF-8 encoding of that character)? What if both files exist on the Linux side and you "convert" both names, how do you make sure Windows keeps them separate? – trent Commented Feb 3 at 15:14
  • If your answer is - reasonably - "I don't support such ridiculous corner cases", then you have your answer. Paths - at least, arbitrary paths - are inherently non-cross-platform. – trent Commented Feb 3 at 15:16
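The encodings in the example above can be checked directly. This is only an illustrative sketch, and the helper names `is_utf8_name` and `lossy_name` are made up here. It shows that the single byte 0xF0 is not valid UTF-8, while ð is one UTF-16 code unit but two UTF-8 bytes.

```rust
/// Hypothetical helper: would these raw file-name bytes survive a conversion to `String`?
fn is_utf8_name(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Hypothetical helper: what a lossy conversion turns the raw bytes into.
fn lossy_name(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // The single byte 0xF0 is a legal file name on Linux but not valid UTF-8.
    assert!(!is_utf8_name(&[0xF0]));
    // A lossy conversion replaces it with U+FFFD, so the original byte is gone.
    assert_eq!(lossy_name(&[0xF0]), "\u{FFFD}");

    // 'ð' (U+00F0) is one UTF-16 code unit, as Windows would store it ...
    let utf16: Vec<u16> = "ð".encode_utf16().collect();
    assert_eq!(utf16, [0x00F0]);
    // ... but two bytes in UTF-8, as a Linux file name would usually store it.
    assert_eq!("ð".as_bytes(), [0xC3, 0xB0]);
}
```

So the two names are different byte strings on Linux, and only one of them has a String representation at all.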

2 Answers


If you need the paths to share one format on every platform (and to be representable on every platform; see why Path can't always just turn directly into a String), I think you may need to call Path::components on each path and then write the components out manually in your own format. In effect this is a very small parser. Alternatively, you could try write!(file, "{}", path.display()), which gives the same result as converting to String lossily.

The issue, as I understand it, lies exactly in that bit about encoded bytes not being portable. You say that you want to consider switching platforms, but not all Windows paths are valid Linux/macOS paths and vice versa. Raw paths are not cross-platform things. For a lossless encoding (one that keeps all the meaning) they have to be formatted, or manually parsed, to and from each platform. Otherwise you can take the more common approach of dumbing the path down for debugging purposes (to_string_lossy), or reject paths that cannot be encoded (into_string).
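A minimal sketch of the Path::components approach, under some loud assumptions: the names `to_portable` and `from_portable` and the '/'-separated text format are invented for this illustration, only relative paths are handled, and anything that is not valid UTF-8 is rejected instead of encoded.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: encode a relative path as '/'-separated UTF-8 text.
/// Returns None for anything this sketch does not handle.
fn to_portable(path: &Path) -> Option<String> {
    let mut parts: Vec<&str> = Vec::new();
    for component in path.components() {
        match component {
            Component::Normal(name) => {
                // Reject names that are not valid UTF-8.
                let name = name.to_str()?;
                // Reject names containing a separator of either platform,
                // e.g. a Linux file literally named `a\b`.
                if name.contains('/') || name.contains('\\') {
                    return None;
                }
                parts.push(name);
            }
            // `.` carries no information, so it is skipped.
            Component::CurDir => {}
            Component::ParentDir => parts.push(".."),
            // Roots and Windows prefixes (`C:`) have no portable meaning here.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(parts.join("/"))
}

/// Hypothetical helper: rebuild a PathBuf using the current platform's separator.
fn from_portable(text: &str) -> PathBuf {
    text.split('/').filter(|part| !part.is_empty()).collect()
}

fn main() {
    let original = Path::new("data").join("logs").join("app.txt");
    let text = to_portable(&original).expect("path not representable in this sketch");
    assert_eq!(text, "data/logs/app.txt");
    assert_eq!(from_portable(&text), original);
}
```

Rejecting absolute paths is a design choice of this sketch, since a root such as `C:\` or `/` has no equivalent on the other platform.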

I don't know if there are any edge cases, but I'd just use to_str and from_str:

use std::str::FromStr;

fn main() {
    let p = std::env::current_dir().unwrap().join("some.file");

    // to_str() returns None if the path is not valid UTF-8, so this unwrap panics on such paths.
    std::fs::write("./my.path", p.to_str().unwrap()).unwrap();

    // Read the file back and rebuild the PathBuf from its UTF-8 contents.
    let p = std::path::PathBuf::from_str(
        std::str::from_utf8(&std::fs::read("./my.path").expect("Read File Error"))
            .expect("File is not utf-8 format"),
    )
    .expect("File content is not a valid PathBuf");

    println!("{:?}", p);
}
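The same round trip can be written so that a non-UTF-8 path produces an error instead of a panic. This is only a sketch: `save_path` and `load_path` are hypothetical names, and the choice of io::ErrorKind::InvalidData is an assumption of this example.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: write `path` to `file` as UTF-8 text,
/// returning an error if the path is not valid UTF-8.
fn save_path(file: &Path, path: &Path) -> io::Result<()> {
    let text = path.to_str().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, "path is not valid UTF-8")
    })?;
    std::fs::write(file, text)
}

/// Hypothetical helper: read the text back and rebuild the PathBuf.
/// read_to_string already fails if the file contents are not UTF-8.
fn load_path(file: &Path) -> io::Result<PathBuf> {
    let text = std::fs::read_to_string(file)?;
    Ok(PathBuf::from(text))
}

fn main() -> io::Result<()> {
    let file = std::env::temp_dir().join("my.path");
    let p = std::env::current_dir()?.join("some.file");

    save_path(&file, &p)?;
    let restored = load_path(&file)?;
    assert_eq!(restored, p);
    println!("{:?}", restored);
    Ok(())
}
```

The stored text still uses the separator of the platform that wrote it, so this only addresses the UTF-8 question, not the cross-platform one.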