
WordPress XML Import hooks


I need to merge posts from several remote sites into one site, placing them under a given category. How do I hook the import plugin so that the posts are saved under that category?

Example:

The receiving site has a category auto, and I want to import all the posts from (e.g.) auto under this category. If a remote post has a category other than auto, that category should be added as a child.

Images inside the posts should be downloaded, and afterwards all the links inside the posts should be updated. I already found a core method for this, and my attempt so far is below, but I think it can be made simpler.

<?php


if (!class_exists('Wp_Http'))
    include_once(ABSPATH . WPINC . '/class-http.php');

require_once ABSPATH . 'wp-admin/includes/import.php';

if (!class_exists('WP_Importer')) {
    $class_importer = ABSPATH . 'wp-admin/includes/class-wp-importer.php';

    if (file_exists($class_importer)) {
        require $class_importer;
    }
}


class WordpressMigration extends WP_Importer
{
    public $wpXML;

    public $xml;

    public $domain;

    function __construct($wpXML)
    {
        $this->wpXML = $wpXML;

        $this->xml = simplexml_load_file($this->wpXML);

        $this->domain = (string)$this->xml->channel->link;

    }

    public function getPosts()
    {
        $this->xml = simplexml_load_file($this->wpXML);
        $posts = array();

        /* import authors */
        $authors = $this->xml->channel->children('wp', true);

        foreach ($authors->author as $author) {

        }

        foreach ($this->xml->channel->item as $item) {
            $categories = array();

            foreach ($item->category as $category) {
                //echo $category['domain'];
                if ($category['nicename'] != "uncategorized" && $category['domain'] == "category") {

                    $categories[] = $category['nicename'];
                }
            }
            $content = $item->children('content', true);
            $doc = new DOMDocument();
            $doc->loadHTML(mb_convert_encoding(html_entity_decode($content->encoded), 'HTML-ENTITIES', 'UTF-8'));
            $imgs = $doc->getElementsByTagName('img');

            //get the remote images and upload to media library
            if ($imgs instanceof DOMNodeList) {
                foreach ($imgs as $i => $img) {
                    $http = new WP_Http();
                    $targetImage = $img->getAttribute('src');
                    $response = $http->request($targetImage);

                    // Skip images that could not be fetched (WP_Error or non-200 status).
                    if (is_wp_error($response) || wp_remote_retrieve_response_code($response) != 200) {
                        //write_log
                        continue;
                    }

                    $upload = wp_upload_bits(basename($targetImage), null, $response['body']);
                    if (!empty($upload['error'])) {
                        //write_log
                        continue;
                    }

                    // Point the <img> at the local copy.
                    $img->setAttribute('src', $upload['url']);
                }
            }

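            $targetLinks = $doc->getElementsByTagName('a');

            // Rewrite internal links so they point at the receiving site.
            // Assumption: internal links start with the source site's channel <link>.
            foreach ($targetLinks as $targetLink) {
                $href = $targetLink->getAttribute('href');
                if ($this->domain !== '' && strpos($href, $this->domain) === 0) {
                    $targetLink->setAttribute('href', home_url(substr($href, strlen($this->domain))));
                }
            }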

            $posts[] = array(
                "title" => $item->title,
                "content" => htmlentities(html_entity_decode($doc->saveHTML())),
                "pubDate" => $item->pubDate,
                "categories" => implode(",", $categories),
                "slug" => str_replace("/", "", str_replace("", "", $item->guid))
            );
        }


        return $posts;
    }
}

?>

asked May 9, 2019 at 18:39 by fefe; edited May 9, 2019 at 18:56
  • You've edited in some code since my answer. This is one way to do it, but you could try the wp_import_post_data_raw filter instead. It could save you a good deal of parsing. – Stephan Samuel, May 9, 2019 at 19:05
  • Thanks for the feedback, I will try! – fefe, May 9, 2019 at 19:48

1 Answer

This is slightly complicated. I can think of three ways to do this:

  • Use someone else's importer plugin. Importing content into WP using complicated criteria is a solved problem. However, I've never met a free importer plugin that I liked. Most of the projects I work on have less cash and more developer hours, but if yours doesn't, this is going to be the easiest route.
  • Modify the XML. There are a few ways to do this. You could write code for it; it's always better to parse XML semantically rather than as strings (and there are many libraries to help), but most people still do it as strings. You could do it with XSLT. You could also just munge it in Excel. I've done all three at some point, and this qualifies as an ETL task. ETL tasks always start off really simple and usually end up complicated.
  • Hook the importer. There are some hooks already. This ticket in Make seems to have been included since 3.4.1 and says it added a bunch of hooks that could be useful for you, like wp_import_posts. Either way, you're going to need to write code in some plugin. This would take care of much of the ETL complexity, since at least some of the import work is already done for you.
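
As a rough illustration of the "modify the XML" route, here is a minimal sketch that parses the export semantically with DOMDocument and adds the target category to every item that lacks it. The function name add_category_to_wxr_items and its parameters are hypothetical, and it assumes the usual WXR layout in which each item carries category elements with domain and nicename attributes. It does not set up the parent/child relationship between categories.

```php
<?php
/**
 * Add a category to every <item> of a WXR export that does not already have it.
 * Hypothetical helper for illustration; returns the modified XML as a string.
 */
function add_category_to_wxr_items(string $wxr, string $slug, string $name): string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($wxr)) {
        throw new InvalidArgumentException('Could not parse the WXR document.');
    }

    foreach ($doc->getElementsByTagName('item') as $item) {
        // Look for an existing <category domain="category" nicename="$slug">.
        $hasTarget = false;
        foreach ($item->childNodes as $child) {
            if ($child instanceof DOMElement
                && $child->nodeName === 'category'
                && $child->getAttribute('domain') === 'category'
                && $child->getAttribute('nicename') === $slug) {
                $hasTarget = true;
                break;
            }
        }

        // Append the target category only when the item does not have it yet.
        if (!$hasTarget) {
            $category = $doc->createElement('category');
            $category->setAttribute('domain', 'category');
            $category->setAttribute('nicename', $slug);
            $category->appendChild($doc->createCDATASection($name));
            $item->appendChild($category);
        }
    }

    return $doc->saveXML();
}
```

The result could then be written back to a file and fed to the regular importer, for example with file_put_contents('export-auto.xml', add_category_to_wxr_items(file_get_contents('export.xml'), 'auto', 'Auto')), where both file names are placeholders.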

Choose your path... if it's one of these, add a comment and I'll try to help you more.
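
For the "hook the importer" route, a minimal sketch might look like the following. It assumes the array shape that the WordPress Importer plugin's WXR parser appears to produce, where each post has a terms list of name/slug/domain entries; check that against the plugin version you run. The function name force_import_category and the default slug and name are hypothetical. Making other categories children of auto is not covered here.

```php
<?php
/**
 * Ensure every parsed post carries the target category before it is imported.
 * Assumption: $post['terms'] is a list of ['name' => ..., 'slug' => ..., 'domain' => ...].
 */
function force_import_category(array $posts, string $slug = 'auto', string $name = 'Auto'): array
{
    foreach ($posts as &$post) {
        $terms = isset($post['terms']) && is_array($post['terms']) ? $post['terms'] : array();

        // Check whether the post already has the target category.
        $found = false;
        foreach ($terms as $term) {
            if (($term['domain'] ?? '') === 'category' && ($term['slug'] ?? '') === $slug) {
                $found = true;
                break;
            }
        }

        if (!$found) {
            $terms[] = array('name' => $name, 'slug' => $slug, 'domain' => 'category');
        }
        $post['terms'] = $terms;
    }
    unset($post); // break the reference left over from the loop

    return $posts;
}

// Register only when running inside WordPress, so the function can be tested on its own.
if (function_exists('add_filter')) {
    add_filter('wp_import_posts', 'force_import_category');
}
```

Placed in a small plugin, this runs once on the full list of parsed posts. The wp_import_post_data_raw filter mentioned in the comments could be used in a similar way to adjust one post at a time.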
