Javascript Parse HTML: Get everything inside table tag - Stack Overflow

If I have the http.responseText stored in a JavaScript variable, e.g. named sourcecode, which contains the whole source code of a page, how do I extract everything between the table tags into another JavaScript variable? The HTML code looks like this:

<table border="0" width="100%" cellspacing="0" cellpadding="0" class="statusbox_ok" style="margin-top: 5px; margin-bottom: 5px">
    <tbody><tr>
        <td align="left" valign="top"><img src=".jpg" style="margin: 2px; margin-right: 10px"></td>
        <td align="left" valign="middle" width="100%">
        Your new username is Tom.   </td>
    </tr>
    </tbody></table>

I want to at least be able to extract:

<td align="left" valign="middle" width="100%">
            Your new username is Tom.   </td>

It doesn't matter if the result also includes everything between the tbody tags or the whole table, but that td part must end up in a JavaScript variable. How do I do this without jQuery? Thanks.

asked May 6, 2014 at 15:23 by user3466601; edited May 6, 2014 at 15:30 by Ejaz
  • Can you add an id attribute to the table/column? What is your best-case-scenario output (the column, or the table, or all the columns)? Are there other tables in the document that you want to exclude? If so, how do we know which table you want? – Sam Commented May 6, 2014 at 15:25
  • Here's something to start with: regex101.com/r/kB8wA9 – sshashank124 Commented May 6, 2014 at 15:25
  • @Sam Nope, i can't as the site does not belong to me. – user3466601 Commented May 6, 2014 at 15:26
  • Slightly improved: regex101.com/r/hB2pF9. Hopefully that will set you on the right track – sshashank124 Commented May 6, 2014 at 15:26
  • That's fine, is this the only table? Because right now I'm working on something that will just grab a table's columns, however if there are multiple tables you may get unintended results.. – Sam Commented May 6, 2014 at 15:27

1 Answer


Update:

In this article I read about DOMParser(), which lets you parse a string into a DOM document with JavaScript. Calling .parseFromString() with the "text/html" type turns an HTML string into a Document that can be queried like the page's own document.

var html = '<html><table /></html>'; // Your source code
html = new DOMParser().parseFromString(html, "text/html");

Just make sure you replace document.getElementsByTagName('table') with html.getElementsByTagName('table') in the code below, since we are now looking for tables in the parsed string, not in the current document.

Updated JSFiddle.
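
As a minimal sketch of how the parsing step and the table lookup fit together: the helper below takes an already parsed document and returns the inner HTML of the first table whose class matches. The name getTableHtmlByClass and the shortened sample string are assumptions for illustration, not part of the original answer. The DOMParser call is guarded so that it only runs in a browser.

```javascript
// Hypothetical helper (not from the original answer): returns the inner HTML
// of the first <table> whose class attribute is exactly `className`,
// or null if no such table exists.
// `doc` is anything that offers getElementsByTagName, e.g. a parsed Document.
function getTableHtmlByClass(doc, className) {
    var tables = doc.getElementsByTagName('table');
    for (var i = 0; i < tables.length; i++) {
        if (tables[i].className === className) {
            return tables[i].innerHTML;
        }
    }
    return null;
}

// Browser-only usage; DOMParser is not defined in plain Node.js.
if (typeof DOMParser !== 'undefined') {
    // Shortened stand-in for the http.responseText from the question.
    var sourcecode = '<table class="statusbox_ok"><tbody><tr>' +
        '<td>Your new username is Tom.</td></tr></tbody></table>';
    var doc = new DOMParser().parseFromString(sourcecode, 'text/html');
    console.log(getTableHtmlByClass(doc, 'statusbox_ok'));
}
```

Passing the document in as an argument keeps the lookup independent of where the markup came from, so the same function works on the live page (pass document) or on a parsed response.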


I avoided using regular expressions, because HTML isn't a regular language and you shouldn't use regular expressions to match it. Also, there are enough pure JavaScript functions to accomplish your task.

var tables = document.getElementsByTagName('table');
for(var tableIt = 0; tableIt < tables.length; tableIt++) {
    var table = tables[tableIt];
    if(table.className === 'statusbox_ok') {
        var columns = table.getElementsByTagName('td');
        for(var columnIt = 0; columnIt < columns.length; columnIt++) { // declare the counter so it is not an implicit global
            var column = columns[columnIt];
            console.log(column.innerHTML);
        }
    }
}

I loop through all of the table elements with .getElementsByTagName(), then check .className to make sure it is the statusbox_ok table. Note that the strict comparison only matches when statusbox_ok is the table's only class. Inside that table, .getElementsByTagName() is used again to loop through all of the columns (the td cells). You can add some logic here to pick out the column you want (similar to the class check on the table); in this example I simply log the HTML contents of each column with .innerHTML.
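
One possible form of that column-selection logic, as a sketch: in the markup from the question, the wanted cell is the one with valign="middle", so the helper below filters on that attribute. The name findCellByValign is hypothetical, and treating valign as a reliable marker on the real page is an assumption.

```javascript
// Hypothetical helper (not from the original answer): returns the trimmed
// inner HTML of the first <td> inside `table` whose valign attribute equals
// `valign`, or null if no cell matches.
function findCellByValign(table, valign) {
    var cells = table.getElementsByTagName('td');
    for (var i = 0; i < cells.length; i++) {
        if (cells[i].getAttribute('valign') === valign) {
            return cells[i].innerHTML.trim();
        }
    }
    return null;
}
```

Inside the table loop, findCellByValign(table, 'middle') could take the place of the inner column loop when only the message cell is needed.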

Check out this JSFiddle for a working example.
