Let's say you are using Playwright to validate some HTML that looks like this:
<span>
The time is:
<time>5:30 pm</time>
</span>
You can use this code:
await page.locator('span', {has: page.locator('time')}).textContent();
to get:
The time is: 5:30 pm
But what if you just want the first part, since it won't change?
The time is:
Is there any way to get an element's text content without getting its children's text?
Currently the only solution I can come up with is to get the text of both and then remove the child's text:
const parent = page.locator('span', {has: page.locator('time')});
const parentText = await parent.textContent();
const child = parent.locator('time');
const childText = await child.textContent();
// Cut the child's text off the end of the parent's text
const onlyParentText = parentText.slice(0, parentText.length - childText.length);
...but that's a lot of JavaScript just to get a single DOM node's text.
Is there any easier way to do the above using Playwright features?
Asked Jan 5, 2024 at 22:59 by machineghost; edited Jan 5, 2024 at 23:12 by jonrsharpe.
Comments:
If you're going to assert on it and don't want the specific time to mess up your expectation, just use e.g. /^The time is:/ or similar (or, better, have more control of what's rendered). – jonrsharpe, commented Jan 5, 2024 at 23:12
Does this answer your question? How can I get the text of an element without children in JavaScript? – Heretic Monkey, commented Jan 6, 2024 at 0:13
@HereticMonkey As it turns out, Playwright at the present time probably has to fall back on similar code from that question, but it's not really a duplicate because this question asks about Playwright. It's conceivable (albeit unlikely) that Playwright introduces a feature to its API that does this, or that there's some other Playwright-specific approach to extracting text nodes. Since OP isn't using .evaluate(), that's part of the answer, so there's still a leap to make between where OP is at and the suggested dupe. – ggorlen, commented Jan 6, 2024 at 0:37
1 Answer
I don't think Playwright has this built in, so going into evaluate is probably the best approach:
const text = await page
  .locator("span", {has: page.locator("time")})
  .evaluate(el => el.firstChild.textContent);
To generalize it to cases with multiple text nodes or arbitrary positioning within a parent, collect every child that is a text node:
const text = await page
  .locator("span", {has: page.locator("time")})
  .evaluate(el =>
    [...el.childNodes]
      .filter(e => e.nodeType === Node.TEXT_NODE)
      .map(e => e.textContent)
  );
Expect to trim and join text as necessary. For example:
const playwright = require("playwright"); // ^1.39.0

const html = `
<p>
a <b>ignore this</b>
</p>
<p> b <b>ignore this</b> c </p>
<p> d <b>ignore this</b> e </p>`;

let browser;
(async () => {
  browser = await playwright.firefox.launch();
  const page = await browser.newPage();
  await page.setContent(html);
  const text = await page
    .locator("p")
    .evaluateAll(els =>
      els.map(el =>
        [...el.childNodes]
          .filter(
            e =>
              e.nodeType === Node.TEXT_NODE &&
              e.textContent.trim()
          )
          .map(e => e.textContent.trim())
      )
    );
  console.log(text); // => [ [ 'a' ], [ 'b', 'c' ], [ 'd', 'e' ] ]
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());
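If you need this in more than one test, the filter/trim/join steps can be pulled into a small function. The sketch below is only an illustration: `ownText` and `fakeSpan` are hypothetical names, not part of Playwright, and the plain-object stand-in exists solely so the logic can be checked in Node without launching a browser. It uses the literal `3` (the numeric value of `Node.TEXT_NODE`) because the `Node` global is not available outside the browser.

```javascript
// Hypothetical helper (not a Playwright API): returns an element's own text,
// skipping any text that belongs to child elements.
function ownText(el) {
  return [...el.childNodes]
    .filter(node => node.nodeType === 3) // 3 is the value of Node.TEXT_NODE
    .map(node => node.textContent.trim())
    .filter(Boolean) // drop whitespace-only text nodes
    .join(" ");
}

// Plain-object stand-in for the <span> from the question (assumption:
// only childNodes, nodeType and textContent are needed by the helper).
const fakeSpan = {
  childNodes: [
    {nodeType: 3, textContent: "\nThe time is:\n"},
    {nodeType: 1, textContent: "5:30 pm"}, // the <time> element
    {nodeType: 3, textContent: "\n"},
  ],
};

console.log(ownText(fakeSpan)); // => The time is:
```

Because the function does not reference anything outside its own body, it should be usable directly as the callback, e.g. `await page.locator("span", {has: page.locator("time")}).evaluate(ownText)`.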
As mentioned in the comments, if you're asserting in a test, it's probably best to use
await expect(locator).toHaveText(/^\s*The time is:/m);
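To see what that pattern accepts, here is a quick check with plain `RegExp.test` against a few sample strings. The samples are made up for illustration (the first mimics the raw text of the span in the question, line breaks included), and this only exercises the regular expression itself, not Playwright's matcher.

```javascript
// The same pattern as in the assertion above.
const pattern = /^\s*The time is:/m;

// Hypothetical sample strings, not taken from a real page.
const samples = [
  "\nThe time is:\n5:30 pm\n", // raw span text with line breaks
  "The time is: 6:45 am",      // a different time still matches
  "Current time: 5:30 pm",     // a different label does not match
];

const results = samples.map(s => pattern.test(s));
console.log(results); // => [ true, true, false ]
```

Note that with the `m` flag, `^` matches at the start of any line, so the pattern would also match if "The time is:" began a later line of the text. Drop the flag if the prefix must sit at the very start.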