javascript - Divide HTML into pages, split long paragraphs - Stack Overflow

Well, I'm not even sure if this can be done with my current approach. I'm trying to fit the contents of an HTML document into pages that are the size of the current viewport. I'm currently doing this by iterating through all of the document's elements and checking whether their top offset is within the current page's boundary. Whenever it is not, this offset becomes the start of a new page, and the page boundary is set to this offset plus the viewport's height.

The problem I'm facing is that often there will be an element (a paragraph, for example) whose height is larger than the viewport itself, so even if the algorithm places this element at the start of a new page, its contents will overflow. I'm trying to find a way to split such elements in a way that the first slice will occupy the remaining part of the page.

This presents further difficulties. Even if I could find a way to determine how much of a paragraph's text still fits within the remainder of a page (and this in itself has proven to be quite difficult), I would still have the problem of the DOM not updating immediately after splitting the paragraph. That would mess up the calculation of the next page, or at least force me to break the recursion, which would complicate the algorithm even more.

Any suggestions on how to split a paragraph in a way that the first slice takes up the remaining space on the page are welcome. This is my code so far:

EDIT: It is worth noting that this would only work on very simple HTML in which there are no absolutely positioned or floated elements. This is not a problem in my case.

var elementIndex = -1;
var pages = 1;
var pageBoundary = 0;
var pageBreaks = [];

function calculatePages() {
    //first page boundary is window height
    pageBoundary = $(window).height();
    //do calculations
    iterateElements($("body"));
    //print results
    console.log(pageBreaks);
}

function iterateElements(parent) {

    $.each($(parent).children(), function(i, e) {
        //increase current element index
        elementIndex = elementIndex + 1;
        //get element's top offset
        var offsetTop = $(e).offset().top;
        //get the last position that the element occupies
        var elementSpan = offsetTop + $(e).outerHeight();

        if ($(e).children().length == 0) { //only leaf nodes will be set as page breaks
            //element's start position is outside page boundary
            if (offsetTop >= pageBoundary) {
                //mark page start with red in order to visualize
                $(e).attr("style", "border-top: 1px solid red");

                //increase page count
                pages = pages + 1;
                //new page starts at element's top, next page boundary
                //is element's starting position plus the viewport's height
                pageBoundary = offsetTop + $(window).height();
                //store index of page break
                pageBreaks.push(elementIndex);
            }
            //element's start position is inside current page, but contents overflow
            else if (elementSpan >= pageBoundary) {
                //NO IDEA WHAT TO DO HERE
                //NEED A WAY TO SPLIT LARGE ELEMENTS
            }
        } 

        iterateElements(e);
    });
}

$(function() {
    calculatePages();
});
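
One way to sidestep the worry about the DOM changing mid-recursion is to measure first and paginate afterwards on plain numbers. The sketch below mirrors the boundary logic of the code above, but works on an array of `{top, height}` records instead of live elements. The names `computePageBreaks`, `boxes` and `overflowing` are made up for illustration, and it uses a strict `>` for the overflow check, which is an assumption rather than what the code above does.

```javascript
// Hypothetical helper: paginate pre-measured boxes instead of live DOM nodes.
// boxes: array of {top, height} in document order (e.g. from offset().top
// and outerHeight() of each leaf element).
function computePageBreaks(boxes, viewportHeight) {
    var breaks = [];       // indexes of boxes that start a new page
    var overflowing = [];  // indexes of boxes that cross the page boundary
    var pageBoundary = viewportHeight; // first page ends at the viewport height

    boxes.forEach(function (box, index) {
        if (box.top >= pageBoundary) {
            // box starts beyond the current page: it opens a new page
            breaks.push(index);
            pageBoundary = box.top + viewportHeight;
        }
        if (box.top + box.height > pageBoundary) {
            // box starts on the current page but its bottom spills over,
            // so it is a candidate for splitting
            overflowing.push(index);
        }
    });

    return { breaks: breaks, overflowing: overflowing };
}
```

The second check runs even right after a page break, so an element taller than the viewport is still reported when it sits at the very top of a new page.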

Asked Mar 31, 2013 at 18:12 by JayPea; edited Apr 1, 2013 at 15:53 by j08691.
  • One naive approach would be to calculate the size of 1/2 words from the paragraph and see if that fits, if so split the large paragraph into two, else split again. – Philip Whitehouse Commented Mar 31, 2013 at 23:22
  • Yes, that was my initial approach actually. The problem is that this might leave several paragraph fragments on the next page, which would alter the normal flow of the text. Also, since the DOM doesn't get updated until the function returns, there seems to be no way to check if 1/2 words fit in the page without breaking the recursion. – JayPea Commented Apr 1, 2013 at 15:38
  • If a single paragraph is larger than entire page, obviously paragraphs are going to be split. As for recursion, normally recursion is the wrong answer to the problem in any case. – Philip Whitehouse Commented Apr 2, 2013 at 18:53
  • Yes, it's ok for a paragraph to be split in two if it doesn't fit the page, but only as long as there is just one fragment on each page. Splitting in half until a half fits the page would result in more fragments than are actually needed, which would cause sentences to be broken in the middle within a single page, for example. Also, I can't think of a good way to iterate over all DOM elements without using recursion. – JayPea Commented Apr 3, 2013 at 18:15
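
On the last point in the comments, walking every element without recursion: one common option is an explicit stack. This is only a sketch; `collectLeaves` is a made-up name, and it relies on nothing but a `children` collection with `length` and index access, which DOM elements have. Unlike the question's code, it would also return the root itself if the root has no children.

```javascript
// Hypothetical helper: collect leaf elements in document order
// using an explicit stack instead of recursion.
function collectLeaves(root) {
    var leaves = [];
    var stack = [root];

    while (stack.length > 0) {
        var node = stack.pop();
        var kids = node.children;

        if (kids.length === 0) {
            // no element children: this is a leaf
            leaves.push(node);
            continue;
        }
        // push children in reverse so the first child is popped first,
        // which keeps the output in document order
        for (var i = kids.length - 1; i >= 0; i--) {
            stack.push(kids[i]);
        }
    }
    return leaves;
}
```

In a browser this could be called as `collectLeaves(document.body)`. The built-in `document.createTreeWalker` is another non-recursive way to visit elements.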

2 Answers

I have done something similar to this. The approach I took is to check the height of the page container. If it was greater than the max, I know elements need to be moved to the next page.

If there are multiple elements, I can move the last element to the next page.

If there is only 1 element, it needs to be split. Let's call this element X. So you can create a new paragraph/section in the next page, let's call that element Y. You can now move words or characters from the end of element X to the start of element Y until the height of element X fits into the page.

After this you can repeat for the next page.
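
A rough sketch of the split step, under a few assumptions: the paragraph holds plain text only (inline markup would be flattened), and a shorter text never needs more room than a longer one. Instead of moving one word at a time as described above, it binary-searches for the largest number of leading words that fits, which gives the same single split point with fewer measurements. The names `countWordsThatFit`, `splitWords` and `splitParagraph` are made up for illustration. Reading `offset()` or `outerHeight()` right after changing the text forces the browser to recompute layout synchronously, so the measurement does reflect the change without waiting for the function to return.

```javascript
// Hypothetical helper: largest count in [0, wordCount] for which fits(count)
// is true. Assumes fits(0) is true and that if n words fit, fewer words fit too.
function countWordsThatFit(wordCount, fits) {
    var low = 0;           // largest count known (or assumed) to fit
    var high = wordCount;  // largest count that might still fit
    while (low < high) {
        var mid = Math.ceil((low + high) / 2);
        if (fits(mid)) {
            low = mid;
        } else {
            high = mid - 1;
        }
    }
    return low;
}

// Split text into a head that fits and a tail that does not.
// fits(candidateText) is supplied by the caller.
function splitWords(text, fits) {
    var words = text.split(/\s+/).filter(Boolean);
    var n = countWordsThatFit(words.length, function (count) {
        return fits(words.slice(0, count).join(" "));
    });
    return {
        head: words.slice(0, n).join(" "),
        tail: words.slice(n).join(" ")
    };
}

// Browser-only usage sketch (assumes jQuery, as in the question).
// Element X keeps the head; the returned clone is element Y with the tail.
function splitParagraph(p, pageBoundary) {
    var $p = $(p);
    var parts = splitWords($p.text(), function (candidate) {
        $p.text(candidate); // mutate, then measure the new bottom edge
        return $p.offset().top + $p.outerHeight() <= pageBoundary;
    });
    $p.text(parts.head);
    return $p.clone().text(parts.tail).insertAfter($p);
}
```

If nothing fits, `head` comes back empty and element X is left as an empty paragraph, so a real implementation would probably want to move the whole element to the next page in that case.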

You could have an index file and make frames. I know it's old school - but maybe?

    <html>
    <head>
    <title>Example for Stackoverflow</title>
    </head>
    <frameset rows="28,*,0" frameborder="0" border="0" framespacing="0">
        <frame name="topNav" src="top_nav.html" scrolling="no" noresize>
    <frameset cols="110,*,150" frameborder="0" border="0" framespacing="0">
        <frame name="menu" src="menu_1.html" marginheight="0" marginwidth="0" scrolling="auto" noresize>
        <frame name="content" src="content.html" marginheight="0" marginwidth="0" scrolling="auto" noresize>
        <frame name="related" src="related.html" marginheight="0" marginwidth="0" scrolling="auto" noresize>
    </frameset>
    <noframes>
    <p>This section (everything between the 'noframes' tags) will only be displayed if the users' browser doesn't support frames. You can provide a link to a non-frames version of the website here. Feel free to use HTML tags within this section.</p>
    </noframes>

    </frameset>
    </html>