Browser Emulation Using PhantomJS


PhantomJS is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG.

The hosted service at https://phantomjscloud.com is excellent, but its free plan (500 pages/day) was not enough for my needs. So I looked for open-source projects that can be deployed on Docker hosting or a PaaS. As an example, I will deploy b1nitp7iw/phantomjs-server on DaoCloud, which lets you run three Docker apps for free.

Deployment

Log in to your DaoCloud account. Search for the image with the keyword “b1nitp7iw”.

Click the image to start deploying it.

Choose the testing plan for the deployment.

Set the port to 8080.

Finally, you can see the app link on the dashboard.

Testing

Running curl directly against the target page returns nothing.

Running curl through phantomjs-server returns the page content as expected.

Prevent App Sleep

Although DaoCloud offers a free plan, its apps stop after 24 hours. You can use the restart API to keep the app from sleeping. Before doing so, you need to know your app ID and API token.

Get your app ID.

Get your API token.

Use a Huginn Post Agent to send the request to the API, and set its schedule to an interval shorter than 24 hours.
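If you do not run Huginn, the same restart call can be scripted and scheduled with cron. The following is only a minimal sketch: the endpoint template and the "token" authorization scheme are assumptions, not confirmed DaoCloud API details, so check both against the DaoCloud API documentation before using it. The names build_restart_request and send_restart are hypothetical helpers.

```python
from urllib.request import Request, urlopen


def build_restart_request(endpoint_template: str, app_id: str, api_token: str,
                          auth_scheme: str = "token") -> Request:
    """Build (but do not send) a POST request that asks the platform to restart an app.

    ASSUMPTION: endpoint_template is supplied by the caller and contains an
    "{app_id}" placeholder; the real URL must come from the platform's API docs.
    ASSUMPTION: the API expects an "Authorization: <scheme> <token>" header.
    """
    url = endpoint_template.format(app_id=app_id)
    headers = {"Authorization": f"{auth_scheme} {api_token}"}
    # An empty body plus method="POST" mirrors a Post Agent with no payload.
    return Request(url, data=b"", headers=headers, method="POST")


def send_restart(request: Request, timeout: float = 30.0) -> int:
    """Send the prepared request and return the HTTP status code."""
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

Run it from a scheduler at an interval shorter than 24 hours, passing your own app ID and API token.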

Usage

Send a GET request to the following URL to obtain the response:
http://ze*****omjs.daoapp.io/getDom?url=http://example.com
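The same GET request can be made from a script. Here is a minimal sketch in Python, assuming the server parses the url query parameter in the standard way and therefore accepts a percent-encoded target URL. Encoding matters once the target URL carries its own query string. The helper names and the server_base argument are hypothetical; pass your own app link from the dashboard.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_getdom_url(server_base: str, target_url: str) -> str:
    """Return the /getDom URL that asks the server to render target_url."""
    # urlencode percent-encodes the target so its own "?" and "&" survive.
    query = urlencode({"url": target_url})
    return f"{server_base.rstrip('/')}/getDom?{query}"


def fetch_rendered_dom(server_base: str, target_url: str, timeout: float = 30.0) -> str:
    """Fetch the rendered page through the server and return it as text."""
    with urlopen(build_getdom_url(server_base, target_url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (replace the first argument with your own app link):
# html = fetch_rendered_dom("http://your-app.example.invalid", "http://example.com")
```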
Here is an example in Huginn.
Flow: RSS Agent -> Website Agent -> Data Output Agent
RSS Agent: fetches the article list.

Website Agent: extracts the full text.

Data Output Agent: generates the final RSS.
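To make the first two steps of this flow concrete outside Huginn, here is a rough Python sketch. It is not the Huginn configuration itself: it reads article links from an RSS feed and builds the getDom request the Website Agent would make for each one. The sample feed, the server address, and the function names are made up for illustration, and the sketch assumes the server accepts a percent-encoded url parameter.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Sample</title>
<item><title>First</title><link>http://example.com/a</link></item>
<item><title>Second</title><link>http://example.com/b?id=2</link></item>
</channel></rss>"""


def article_links(feed_xml: str) -> List[str]:
    """Step 1 (RSS Agent): collect the link of every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [link.text.strip() for link in root.findall("./channel/item/link") if link.text]


def render_urls(server_base: str, feed_xml: str) -> List[str]:
    """Step 2 (Website Agent): build one getDom request URL per article."""
    base = server_base.rstrip("/")
    return [f"{base}/getDom?{urlencode({'url': link})}" for link in article_links(feed_xml)]


for url in render_urls("http://your-app.example.invalid", SAMPLE_FEED):
    print(url)
```

The remaining step, extracting the full text from each rendered page and writing it into a new feed, depends on the markup of the target site.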

Similar Projects

These projects are also based on PhantomJS, but they are not as stable as the phantomjs-server image.
https://github.com/vbauer/manet
https://github.com/Netflix/sketchy
