php - Automatically filling out web forms and returning the resulting page - Stack Overflow

This is my first time posting here. I greatly appreciate any and all guidance on this subject.

I'm trying to make a program that automatically fills in web forms and submits the data, returning the resulting page to the program so it can continue to 'browse' the page, allowing it to recursively submit even more data.

The main problems I'm having are:

  • The 'submit' button is coded in Javascript, so I don't know where the form data goes when making the page request.
  • I want to fill in the forms using data from an Excel table, so I need to be able to access data from outside the page.
  • I need to be able to navigate the resulting page to continue to submit more data.

More specifically, I'm trying to first log in to the Practice Mate website, navigate to 'Manage Patients', hit 'Add Patients', fill in the proper forms, and submit. I'm filling in the forms from an Excel table thousands of rows long.
Sorry I can't be more clear on this without providing a username and password.

What I've been trying to do is use Javascript to make page requests from a page that retrieves information from the Excel document using PHP. I still can't seem to get anything to work with this method though.

I apologize for being a relative novice at this. Thanks in advance.

asked Jan 7, 2013 at 10:21 by Baozi
  • Because there's some Javascript involved, you aren't going to be able to do this in PHP (as you've tagged this question). Have you considered writing this as a browser userscript or a browser extension? Also, their site TOS seems to prohibit screen-scraping, so be prepared to be actively blocked by them. – Charles Commented Jan 7, 2013 at 10:27
  • Why can't it be done in PHP? – Pastor Bones Commented Jan 7, 2013 at 10:28
  • @PastorBones, show me how to process Javascript inside HTML from within PHP and I'll change my statement. – Charles Commented Jan 7, 2013 at 10:29
  • That sounds like a lot of work. Why not just use a network sniffer to determine how the form post is sent to the server and send it yourself using cURL? If you need a value from a JavaScript variable, you could always parse the HTML and grab it before sending the form. I've done it plenty of times... – Pastor Bones Commented Jan 7, 2013 at 10:31
  • ok, looked at the login form. It's an aspx page. At a cursory glance it requires that a viewstate value be passed with the form data, which can be scraped from the CDATA in the page. – Pastor Bones Commented Jan 7, 2013 at 10:35

2 Answers

You can use PHP cURL to browse and submit forms to websites, but how well it works depends on how the site is set up. Most sites have security checks in place to block bots, so it can be tricky to get everything working.

I spent a little time and came up with this login script. Without a valid username and password I can't verify that it works, but it should do what you need. This short example first requests the page to set any cookies and scrape the __VIEWSTATE value needed to submit the form. It then submits the form using the username and password you provide.

<?php

// Login information
$username = 'test';
$password = 'mypass';
$utcoffset = '-6';
$cookiefile = '/writable/directory/for/cookies.txt';

// The wrapper class defined below is named Login
$client = new Login($cookiefile);

// Retrieve page first to store cookies 
$page = $client -> get("https://pm.officeally./pm/login.aspx");
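// Scrape the __VIEWSTATE value; stop early if the marker is missing
$marker = '__VIEWSTATE" value="';
$start = strpos($page, $marker);
if ($start === false) {
    throw new Exception('__VIEWSTATE not found in login page');
}
$start += strlen($marker);
$end = strpos($page, '"', $start);
$viewstate = substr($page, $start, $end - $start);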

// Do our actual login
$form_data = array(
    '__LASTFOCUS' => '', 
    '__EVENTTARGET' => '',
    '__EVENTARGUMENT' => '',
    '__VIEWSTATE' => $viewstate,
    'hdnUtcOffset' => $utcoffset,
    'Login1$UserName' => $username,
    'Login1$Password' => $password,
    'Login1$LoginButton' => 'Log In'
);
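// The second argument of get() is the referer, so the form data goes third
$page = $client -> get("https://pm.officeally./pm/login.aspx", "https://pm.officeally./pm/login.aspx", $form_data);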

// cURL wrapper class    
class Login {
    private $_cookiefile;

    public function __construct($cookiefile) {
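        // The cookie file may not exist yet, so fall back to checking its directory
        $target = file_exists($cookiefile) ? $cookiefile : dirname($cookiefile);
        if (!is_writable($target)) {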
            throw new Exception('Cannot write cookiefile: ' . $cookiefile);
        }
        $this -> _cookiefile = $cookiefile;
    }

    public function get($url, $referer = 'http://www.google.', $data = false) {
        // Setup cURL
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_REFERER, $referer);
        curl_setopt($ch, CURLOPT_COOKIEFILE, $this -> _cookiefile);
        curl_setopt($ch, CURLOPT_COOKIEJAR, $this -> _cookiefile);
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_AUTOREFERER, true);
        curl_setopt($ch, CURLOPT_MAXREDIRS, 10);

        // Is there data to post
        if (!empty($data)) {
            curl_setopt($ch, CURLOPT_POST, 1);
            curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
        }

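        // curl_exec() returns false on failure; surface the reason instead
        $response = curl_exec($ch);
        if ($response === false) {
            throw new Exception('cURL error: ' . curl_error($ch));
        }
        return $response;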
    }

}
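
The strpos-based scrape above only picks up __VIEWSTATE. ASP.NET pages often carry other hidden fields as well (__EVENTVALIDATION is a common one), so a more general option is to collect every hidden input in one pass. This is a minimal sketch using PHP's DOM extension; the function name extractHiddenFields and the sample markup are illustrative assumptions, not taken from the actual site.

```php
<?php

/**
 * Collect all <input type="hidden"> name/value pairs from an HTML page.
 * Hypothetical helper for illustration; not part of the original script.
 */
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = array();
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//input[@type="hidden"]') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Made-up sample markup standing in for the login page.
$sample = '<form>'
        . '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123=" />'
        . '<input type="hidden" name="__EVENTVALIDATION" value="xyz" />'
        . '<input type="text" name="Login1$UserName" />'
        . '</form>';

$hidden = extractHiddenFields($sample);
echo $hidden['__VIEWSTATE'], "\n";
```

The result could then be combined with the login fields, for example with array_merge($hidden, $form_data), so that whatever hidden values the server sent are posted back unchanged.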

I think cURL will do the trick; the curl_init() handle is self-explanatory enough. I'm still at the start of reading the docs, but I expect good results. I'm less sure how flexible PHP's data structures are, which will matter a lot with cURL. Hoping for good luck down the line.
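
On how PHP structures map onto cURL: a plain associative array is enough, since http_build_query() turns it into the URL-encoded body that CURLOPT_POSTFIELDS accepts. As a rough sketch, assuming the Excel sheet from the question has been exported to CSV with a header row, each row can become one such array. The column names, the form field names in $map, and the function rowsToFormData are all hypothetical; the real field names would have to be read from the 'Add Patients' form itself.

```php
<?php

/**
 * Turn CSV text (header row plus data rows) into one form-data array per row.
 * $fieldMap maps CSV column names to form field names (both made up here).
 */
function rowsToFormData(string $csv, array $fieldMap): array
{
    // Read from an in-memory stream so the sketch needs no files on disk.
    $handle = fopen('php://memory', 'r+');
    fwrite($handle, $csv);
    rewind($handle);

    $header = fgetcsv($handle, 0, ',', '"', '\\');
    $forms = array();
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === array(null)) {
            continue; // fgetcsv reports a blank line as array(null)
        }
        $record = array_combine($header, $row);
        $form = array();
        foreach ($fieldMap as $column => $field) {
            $form[$field] = $record[$column];
        }
        $forms[] = $form;
    }
    fclose($handle);
    return $forms;
}

// Made-up data and field names, for illustration only.
$csv = "first,last\nAda,Lovelace\nAlan,Turing\n";
$map = array('first' => 'ctl00$FirstName', 'last' => 'ctl00$LastName');

foreach (rowsToFormData($csv, $map) as $form) {
    // This string is what would be handed to CURLOPT_POSTFIELDS.
    echo http_build_query($form), "\n";
}
```

Note that http_build_query() percent-encodes the `$` that ASP.NET uses in field names (it becomes %24), which is the form the server expects in a URL-encoded body.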
