console.log(document.getElementsByTagName('html')['0'].textContent);
console.log(document.getElementsByTagName('html')['0'].innerText);
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>Document</title>
</head>
<body>
<p>innnerHtml of paragraph</p>
</body>
</html>
The textContent property prints all the text content inside the html element, excluding the tags. It also prints all the whitespace and newlines. To get the text without the whitespace and newlines, I used the innerText property, but it didn't print the text inside the title element; it printed only the text inside the p element. Why didn't innerText work as I expected?
Asked Oct 21, 2018 at 5:37 by Hash; edited Oct 21, 2018 at 12:20 by Boann.
Comments:
- It is not displayed in the content area of the web page. – Jaromanda X, Oct 21, 2018 at 5:48
- console.log(getComputedStyle(document.querySelector('title')).display) – Jaromanda X, Oct 21, 2018 at 5:50
3 Answers
Your code is working as intended. I think you are confusing the two properties. Have a look at MDN; a couple of the differences it lists:
- While textContent gets the content of all elements, including <script> and <style> elements, innerText does not, only showing human-readable elements.
- innerText is aware of styling and won’t return the text of hidden elements, whereas textContent does.
To remove the extra whitespace and newlines, you can use a regex replace.
// Replace newlines and runs of two or more whitespace characters with a single space
console.log(document.getElementsByTagName('html')['0'].textContent.replace(/[\n\r]+|[\s]{2,}/g, ' '));
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>Document</title>
</head>
<body>
<p>innnerHtml of paragraph</p>
</body>
</html>
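The pattern above can leave a leading or trailing space, and two adjacent spaces where a newline is followed by indentation. A slightly simpler variant, offered only as a sketch, collapses every run of whitespace and then trims the ends. The helper name collapseWhitespace and the sample string are made up for this illustration; the sample only approximates what textContent might return for the page above, and it is a plain string so the snippet can be tried outside a browser.

```javascript
// Hypothetical helper (not from the original answer): collapse each run of
// whitespace, including newlines and tabs, into one space and trim the ends.
function collapseWhitespace(text) {
  return text.replace(/\s+/g, ' ').trim();
}

// Assumed sample, roughly shaped like textContent of the page above.
const sample = '\n\n  Document\n\n\n  innnerHtml of paragraph\n\n';
console.log(collapseWhitespace(sample)); // logs "Document innnerHtml of paragraph"
```

In a browser, the same helper could be applied to document.documentElement.textContent.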
According to MDN:
Node.innerText is a property that represents the "rendered" text content of a node and its descendants. As a getter, it approximates the text the user would get if they highlighted the contents of the element with the cursor and then copied to the clipboard.
The contents of the <title> element aren't rendered as text content and certainly cannot be highlighted or copied to the clipboard. As such, they won't be returned by Node.innerText.
Interestingly, document.getElementsByTagName('title')['0'].innerText does return the contents of the <title> element. I did a bit of reading on this, and it's explained in the spec:
If this element is not being rendered, or if the user agent is a non-CSS user agent, then return the same value as the textContent IDL attribute on this element.
This step can produce surprising results, as when the innerText attribute is accessed on an element not being rendered, its text contents are returned, but when accessed on an element that is being rendered, all of its children that are not being rendered have their text contents ignored.
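The behaviour quoted from the spec can be observed directly. The following is a rough sketch meant for the browser console on the sample page. describeTextProps is a hypothetical helper name, and it takes the document as a parameter only so that it can also be exercised with a stand-in object outside a browser.

```javascript
// Hypothetical helper: gather the values discussed above for a given document.
function describeTextProps(doc) {
  const root = doc.documentElement;          // the <html> element
  const title = doc.querySelector('title');  // null if the page has no <title>
  return {
    // textContent ignores rendering, so the <title> text is included here.
    rootTextContent: root.textContent,
    // <html> is rendered, so non-rendered descendants such as <title> are skipped.
    rootInnerText: root.innerText,
    // <title> itself is not rendered, so innerText falls back to textContent.
    titleInnerText: title ? title.innerText : null,
  };
}

// Only runs where a DOM is available (for example, the browser console).
if (typeof document !== 'undefined') {
  console.log(describeTextProps(document));
}
```

Running getComputedStyle(document.querySelector('title')).display, as suggested in the comments, is a quick way to check whether the browser treats the element as rendered.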
@user9218974, everything under body is supposed to be rendered as the web page, while whatever head contains is just information and resources for the rendering.
So in your code, head contains meta and title, which are just metadata for the page. More or less, you can say that your work area is the content within body.
Now, if you want the text of your title, you need to use document.title.
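If the goal is the title together with the visible text, one possible approach is to combine document.title with the innerText of body. This is only a sketch and assumes the page has a body element. visibleTextWithTitle is a hypothetical name, and the document is passed in as a parameter so the function can be tried with a stand-in object outside a browser.

```javascript
// Hypothetical helper: join the page title and the rendered body text,
// collapsing whitespace in each part and skipping parts that end up empty.
function visibleTextWithTitle(doc) {
  const parts = [doc.title, doc.body.innerText]; // assumes doc.body is not null
  return parts
    .map((part) => part.replace(/\s+/g, ' ').trim())
    .filter((part) => part.length > 0)
    .join(' ');
}

// Only runs where a DOM is available (for example, the browser console).
if (typeof document !== 'undefined') {
  console.log(visibleTextWithTitle(document));
}
```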