
python - How to remove text inside a .jbeam text file without using json? - Stack Overflow


I cannot use a standard JSON parser because I'm working with .jbeam files, which are based on JSON but follow a custom format used internally by BeamNG. These files contain unique formatting rules that cause standard JSON parsers to fail, such as support for comments and the omission of commas between elements, which would typically result in errors in a regular JSON parser.

The task I'm attempting to accomplish is relatively simple: replace the nodes property with some new text in a specific format, while preserving the overall formatting of the original file (which is quite large, though I’m only using a simplified example here). However, using a standard JSON parser alters the formatting of other properties in the file as well. I attempted the solution outlined in this thread, but it quickly became cumbersome, as I had to wrap everything to preserve the formatting of the file, which doesn’t make sense for such a straightforward task. The original formatting is important because without it you will have a nightmare reading .jbeam files as they would increase in number of lines, dramatically decreasing readability.

After hours of struggling with various methods, I ultimately decided to take the simplest approach by using basic text replacement. However, even for simple operations like removing text, I'm encountering issues. I know this task could be easily handled with a bash script, but I’m curious why it's proving to be so difficult with Python. How can I efficiently remove the contents of the nodes property in this example using Python?

sample_jbeam = '''
{
    "partname": {
        "refNodes": [
            ["ref:", "back:", "left:", "up:", "leftCorner:", "rightCorner:"],
            ["ref", "", "", "", "", ""]
        ],
        "nodes": [
            ["id", "posX", "posY", "posZ"],
            ["ref", 0, 0, 0],
            ["b1", 1.0, 1.0, 1.0],
        ],
        "beams": [
        ],
    }
}
'''

I want to remove the contents inside "nodes" so that I'm left with:

{
    "partname": {
        "refNodes": [
            ["ref:", "back:", "left:", "up:", "leftCorner:", "rightCorner:"],
            ["ref", "", "", "", "", ""]
        ],
        "nodes": [],
        "beams": [
        ],
    }
}

I cannot use Python's json loader because jbeam is not really JSON, and besides, json would change the formatting of the original file.
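To illustrate the parsing problem, here is a minimal check on a trimmed-down copy of the sample. The `try_parse` helper and the `snippet` variable are names made up for this illustration. It only shows that the trailing comma alone is enough for the standard `json` module to reject the text.

```python
import json

# Trimmed-down copy of the sample; the trailing comma after the last row
# is what the standard parser objects to.
snippet = '''
{
    "nodes": [
        ["ref", 0, 0, 0],
        ["b1", 1.0, 1.0, 1.0],
    ]
}
'''

def try_parse(text):
    """Return None if text parses as JSON, otherwise the decoder's message."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return exc.msg
    return None

error = try_parse(snippet)
print(error)
```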

Failed Attempt 1: RegEx

I tried RegEx but it doesn't work:


import re

pattern = r'("nodes":\s*)\[.*?\]'
# cleaned_text = re.sub(pattern, r'\1[]', sample_jbeam, flags=re.DOTALL)
cleaned_text = re.sub(r'"nodes": \[.*?\]', '"nodes": []', sample_jbeam, flags=re.DOTALL)
print(cleaned_text)
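A likely reason the pattern fails is that the lazy `.*?` stops at the first `]` it meets, which closes the first inner row rather than the outer "nodes" array. The short check below uses a hypothetical `sample` fragment to show what the pattern actually matches.

```python
import re

# Fragment of the jbeam sample, reduced to the "nodes" property.
sample = '''
        "nodes": [
            ["id", "posX", "posY", "posZ"],
            ["ref", 0, 0, 0],
        ],
'''

match = re.search(r'"nodes": \[.*?\]', sample, flags=re.DOTALL)
matched_text = match.group(0)

# The match ends at the first closing bracket, so the remaining rows
# and the real end of the array are left behind in the text.
print(repr(matched_text))
```

A regular expression of this kind cannot count nested brackets, so any approach based on it needs an explicit depth counter or a small scanner.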

Failed Attempt 2: io.StringIO

I've tried using io.StringIO() as in my answer post but after I changed the jbeam formatting it stopped working.

Failed Attempt 3: json Preprocessor

I tried using this amazing JSON preprocessor, which works really well at turning the jbeam into valid JSON, but unfortunately it also changes the original formatting.

asked Feb 17 at 9:03 by Megan Love, edited Feb 18 at 15:06 by TylerH

1 Answer

Here's a solution, written with the help of an expert friend of mine, mgerhardy, that removes the contents and inserts new content using io.StringIO():

import io

class JBeamProcessor:
    def __init__(self, json_data):
        self.json_data = json_data
        self.input_stream = None
        self.output_stream = io.StringIO()
        self.modified_data = json_data

    def remove_node_contents(self, key):
        self.output_stream = io.StringIO()
        self.input_stream = io.StringIO(self.modified_data)

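        # Bracket depth inside the array being skipped. It must start at 0,
        # because keys are only detected while depth == 0.
        depth = 0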
        skipping = False
        key_buffer = []
        inside_string = False

        while True:
            ch = self.input_stream.read(1)
            if not ch:
                break

            if ch == '"':
                inside_string = not inside_string

            if inside_string and depth == 0:
                key_buffer.append(ch)
                if len(key_buffer) > 255:
                    key_buffer = key_buffer[:255]

            if not inside_string and key_buffer:
                key_str = ''.join(key_buffer)
                key_buffer = []
                if key_str[1:] == key:
                    skipping = True
                    self.output_stream.write('"' + ':' + ' ')

            if ch == '[' and not inside_string:
                if skipping:
                    depth += 1
                    if depth == 1:
                        self.output_stream.write('[')
                    continue

            if ch == ']' and not inside_string:
                if depth > 0:
                    depth -= 1
                    if depth == 0:
                        skipping = False

            if not skipping:
                self.output_stream.write(ch)

        return self.output_stream.getvalue()

    def get_key_indent(self, key):
        current_pos = self.input_stream.tell()
        self.input_stream.seek(0)

        for line in self.input_stream:
            stripped_line = line.lstrip()
            if stripped_line.startswith(f'"{key}"'):
                indent = len(line) - len(stripped_line)
                self.input_stream.seek(current_pos)
                return indent

        self.input_stream.seek(current_pos) 
        return -1

    def insert_node_contents(self, key, new_contents):
        self.remove_node_contents(key)
        spaces = self.get_key_indent(key)
        indent = " " * spaces
        result = self.get_result()
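        # Rows sit one level deeper than the key; a 4-space step is assumed here,
        # matching the sample file.
        inner_indent = indent + "    "
        indented_contents = "\n".join(inner_indent + line for line in new_contents.splitlines())
        result = result.replace(f'"{key}": []', f'"{key}": [\n{indented_contents}\n{indent}]')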
        self.modified_data = result
        return result

    def get_result(self):
        return self.output_stream.getvalue()

Usage:

jbeam_filepath = "example.jbeam"  # hypothetical path; point this at your own file

with open(jbeam_filepath, "r", encoding="utf-8") as f:
    existing_data_str = f.read()

processor = JBeamProcessor(existing_data_str)
existing_data_str = processor.insert_node_contents("nodes", "your replacement text or nodes")
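As a shorter alternative to the character stream above, the same idea can be sketched as a single function that locates the key and then counts brackets until the array closes. This is only a sketch under assumptions: `clear_array` is a hypothetical helper, comments are assumed to use `//` and `/* */`, and the first textual occurrence of the key is assumed to be the one you want (an occurrence inside a comment or a string value would also match).

```python
import re

def clear_array(text, key):
    """Replace the contents of the first '"key": [...]' array with '[]'.

    Everything outside that array is returned untouched, so the original
    formatting of the rest of the file is preserved.
    """
    m = re.search('"' + re.escape(key) + r'"\s*:\s*\[', text)
    if m is None:
        return text  # key not found: leave the text unchanged
    start = m.end() - 1  # index of the opening '['
    depth = 0
    i = start
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == '"':
            # Skip a string literal, stepping over backslash escapes.
            i += 1
            while i < n and text[i] != '"':
                i += 2 if text[i] == '\\' else 1
        elif text.startswith('//', i):
            # Skip a line comment (assumed syntax).
            j = text.find('\n', i)
            i = n if j == -1 else j
        elif text.startswith('/*', i):
            # Skip a block comment (assumed syntax).
            j = text.find('*/', i + 2)
            i = n if j == -1 else j + 1
        elif ch == '[':
            depth += 1
        elif ch == ']':
            depth -= 1
            if depth == 0:
                return text[:start] + '[]' + text[i + 1:]
        i += 1
    return text  # brackets never balanced: leave the text unchanged

sample = '''{
    "refNodes": [
        ["ref:", "back:"],
        ["ref", ""]
    ],
    "nodes": [
        ["id", "posX", "posY", "posZ"], // header row, a stray ] here is ignored
        ["b1", 1.0, 1.0, 1.0],
    ],
    "beams": [
    ],
}'''

result = clear_array(sample, "nodes")
print(result)
```

Because only the slice between the matching brackets is replaced, new contents can be inserted the same way by substituting a formatted block for `'[]'`.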