Remove Namespace from xml Python ISSUE - Stack Overflow

Good afternoon everyone, I have a problem with a Python script. I need to remove the namespaces from the following input:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<enviOperacaoRessarcimento xmlns=";>
  <versao>2.00</versao>
  <dadosDeclaracao>
  <cnpjRaiz>0000000</cnpjRaiz>

The problem is that the expected output needs to keep the default namespace declaration:

xmlns=";

but the actual result is:

<?xml version='1.0' encoding='utf-8'?>
<enviOperacaoRessarcimento>  
  <versao>2.00</versao>  
  <dadosDeclaracao>
  <cnpjRaiz>0000000</cnpjRaiz>  

In other words, when I remove the namespace prefixes, the xmlns="; declaration is removed as well. The script is as follows:

import re
import xml.etree.ElementTree as ET


def remove_namespace(xml_str):
    #xml_str = re.sub(r' xmlns="[^"]+"', '', xml_str, count=1)
    xml_str = re.sub(r'ns0:|ns0','', xml_str, count=1)
    return xml_str


# df is the input DataFrame with 'xml', 'fullpath' and 'status' columns
for index, row in df.iterrows():
    try:
        # Remove namespaces from the XML data
        clean_xml = remove_namespace(row['xml'])

        # Parse the cleaned XML data
        root = ET.fromstring(clean_xml)
        print(root)

        # Write the XML data to the specified output file
        tree = ET.ElementTree(root)
        tree.write(row['fullpath'], xml_declaration=True, encoding='utf-8', method="xml")
        print(root)

        # Update status column and print message to Alteryx result window
        df.at[index, 'status'] = 'Successful'

    except Exception as e:
        # Update status column with error message
        df.at[index, 'status'] = f'error: {str(e)}'
Could someone help me?
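
One possible direction, sketched under assumptions rather than offered as a confirmed fix: ElementTree writes ns0: prefixes when it has no prefix registered for a namespace URI, so registering the URI under the empty prefix before writing may keep the xmlns declaration on the root without any regex clean-up. The namespace URI is truncated in the post above, so NS_URI below is a placeholder, and serialize_with_default_namespace is a hypothetical helper name.

```python
import io
import xml.etree.ElementTree as ET

# Placeholder URI (assumption): the real value is truncated in the post.
NS_URI = "http://example.com/placeholder"


def serialize_with_default_namespace(xml_str, ns_uri):
    """Parse xml_str and serialize it with ns_uri as the default namespace."""
    # Map the URI to the empty prefix so the serializer emits xmlns="..."
    # on the root element instead of generating ns0: prefixes.
    ET.register_namespace("", ns_uri)
    root = ET.fromstring(xml_str)
    buffer = io.BytesIO()
    ET.ElementTree(root).write(
        buffer, xml_declaration=True, encoding="utf-8", method="xml"
    )
    return buffer.getvalue()


# Small sample modeled on the input in the question.
sample = (
    f'<enviOperacaoRessarcimento xmlns="{NS_URI}">'
    "<versao>2.00</versao>"
    "</enviOperacaoRessarcimento>"
)
result = serialize_with_default_namespace(sample, NS_URI).decode("utf-8")
print(result)
```

Note that ET.register_namespace changes a process-wide mapping, and this sketch only applies when the original xmlns attribute is left in the string before parsing, so that the elements actually belong to that namespace.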