
How can I update the src attribute of an <img> tag in a Markdown file using Python?


I have a Markdown file README.md that contains HTML elements, such as an <img> tag with attributes id and src. I want to update the attributes of this HTML element programmatically using Python.

For example, I want to update the src attribute of a tag with id="updatable". I’ve successfully extracted the src attribute using BeautifulSoup, but I’m stuck when trying to update it.

Here’s my current approach:

import chardet
import markdown
from bs4 import BeautifulSoup


def get_encoding_type(file_path):
    # Guess the file encoding from the first 1 KiB of raw bytes.
    with open(file_path, 'rb') as f:
        sample = f.read(1024)
    return chardet.detect(sample)['encoding']


with open("README.md", "r", encoding=get_encoding_type("README.md"), errors='ignore') as f:
    markdown_content = f.read()

# Render the Markdown to HTML, then parse it so the <img> tag can be located.
html_content = markdown.markdown(markdown_content)
parser = BeautifulSoup(html_content, "html.parser")

img_tag = parser.find("img", {'id': 'updatable'})

# find() returns None when no tag matches, so check before reading attributes.
if img_tag is not None and img_tag.has_attr('src'):
    print(img_tag['src'])
else:
    print("No image found")

So far, I’ve managed to extract the src attribute, but I’m unsure how to update it.


Asked Jan 18 at 15:41 by rakin235; edited Jan 19 at 16:50 by CPlus.

1 Answer

You could simply assign the new value to the src attribute:

soup.find("img",{'id':'updatable'})['src'] = 'new_value'

Example:

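from bs4 import BeautifulSoup

html = '<html><body><img id="updatable" src="old_value"/></body></html>'
# Name the parser explicitly; omitting it makes bs4 pick one and emit a warning.
soup = BeautifulSoup(html, "html.parser")

# Tag attributes behave like dictionary entries, so assignment updates them in place.
img = soup.find("img", {'id': 'updatable'})
img['src'] = img['src'].replace('old', 'new')

print(soup)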

Output:

<html><body><img id="updatable" src="new_value"/></body></html>
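Note that the assignment above changes only the parsed tree in memory; README.md itself is untouched until something is written back, and writing out the rendered HTML would lose the Markdown formatting. One possible way to keep the file as Markdown is to rewrite only the matching <img> tag in the raw text. The following is a minimal sketch under assumptions, not part of the original answer: the helper name update_img_src and the IMG_TAG pattern are hypothetical, and the regex assumes no attribute value contains a ">" character and does not skip <img> tags that appear inside code blocks.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical pattern: matches a single <img ...> tag in raw Markdown text.
# Assumes no attribute value contains '>'.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def update_img_src(markdown_text, img_id, new_src):
    """Return markdown_text with the src of the <img> whose id is img_id replaced."""

    def rewrite(match):
        # Parse just this one tag so attribute handling is left to BeautifulSoup.
        tag = BeautifulSoup(match.group(0), "html.parser").find("img")
        if tag is None or tag.get("id") != img_id:
            return match.group(0)  # leave every other image untouched
        tag["src"] = new_src
        return str(tag)

    return IMG_TAG.sub(rewrite, markdown_text)


sample = (
    "# Title\n\n"
    '<img id="updatable" src="old.png">\n\n'
    '<img id="other" src="keep.png">\n'
)
updated = update_img_src(sample, "updatable", "new.png")
print(updated)
```

To apply this to a real file, read README.md into a string, pass it through update_img_src, and write the result back with the same encoding. The rewritten tag is re-serialized by BeautifulSoup, so its closing may change from ">" to "/>", while the rest of the Markdown is left as it was.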
