Puppeteer JS Renderer
Let's get started
To install this project locally, execute the following commands:
git clone https://github.com/fly-examples/puppeteer-js-renderer.git
cd puppeteer-js-renderer
npm install
To quickly try the project out after installing it, run the command below:
This will show something like below:
Pulling views from YouTube, please wait...
Luis Fonsi - Despacito ft. Daddy Yankee has 6,881,463,846 views
If you want to try any other video just pass the YouTube video URL as a parameter to the script like below:
node yt-views https://www.youtube.com/watch?v=XqZsoesa55w
It should give you an output like this:
Pulling views from YouTube, please wait...
Baby Shark Dance | Sing and Dance! | @Baby Shark Official | PINKFONG Songs for Children has 6,077,338,169 views
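The optional URL parameter can be handled with a few lines of Node. The following is a minimal sketch, not the actual yt-views source: the function name pickVideoUrl and the constant DEFAULT_VIDEO_URL are hypothetical, and the default video is assumed from the sample output above.

```javascript
// Hypothetical sketch of the argument handling; not the actual yt-views source.
// Assumption: the script falls back to the Despacito video when no URL is given.
const DEFAULT_VIDEO_URL = 'https://www.youtube.com/watch?v=kJQP7kiw5Fk';

function pickVideoUrl(argv, fallback = DEFAULT_VIDEO_URL) {
  // argv[0] is the node binary and argv[1] is the script path,
  // so the first user-supplied argument sits at index 2.
  const candidate = argv[2];
  if (!candidate) {
    return fallback;
  }
  // new URL() throws a TypeError on malformed input,
  // which surfaces a bad argument before any request is made.
  return new URL(candidate).toString();
}

// Simulates: node yt-views https://www.youtube.com/watch?v=XqZsoesa55w
console.log(
  pickVideoUrl(['node', 'yt-views', 'https://www.youtube.com/watch?v=XqZsoesa55w'])
);
```

In a real script the call would likely be pickVideoUrl(process.argv), so running the script without a parameter uses the default video.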
To show the YouTube video view count, a small scraper written with the Axrio npm package was executed. Axrio combines the popular Axios library and Cheerio to create a mini scraper. Axios makes the requests, and Cheerio acts as a DOM navigator: it parses the markup and provides an API for traversing and manipulating the resulting data structure. You can kickstart a small scraper with Axrio too.
The scraper uses the puppeteer-js-renderer service that is already deployed and running on Fly.io. Have a look at the "how to use it as a service" section for more information. To start your own instance on Fly.io, jump to the "how to deploy on fly.io" section.
If you have Node installed on your machine and have already cloned this repository and run npm install, run npm start to get this service running locally.
The next step is to navigate to
If you want to run with Docker, execute the following after you clone the repository:
cd puppeteer-js-renderer
docker-compose up
Then go to http://localhost:8080/api/render?url=https://www.youtube.com/watch?v=kJQP7kiw5Fk in your browser to see the output.
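When the render URL is built in code rather than typed by hand, it is safer to percent-encode the target page, because a target with its own "&" would otherwise be split into extra parameters. The following is a minimal sketch using Node's built-in URL class. The /api/render path and the url parameter come from the address above; the helper name buildRenderUrl and the default base are illustrative assumptions, and the sketch assumes the service decodes the query string in the usual way.

```javascript
// Hypothetical helper: builds the render endpoint URL for a target page.
// The path and the "url" parameter are taken from the README;
// the function name and default base are assumptions for illustration.
function buildRenderUrl(targetUrl, serviceBase = 'http://localhost:8080') {
  const endpoint = new URL('/api/render', serviceBase);
  // searchParams.set() percent-encodes the value, so the target's own
  // "?" and "&" are not read as parameters of the render service.
  endpoint.searchParams.set('url', targetUrl);
  return endpoint.toString();
}

console.log(buildRenderUrl('https://www.youtube.com/watch?v=kJQP7kiw5Fk'));
// prints http://localhost:8080/api/render?url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DkJQP7kiw5Fk
```

Passing the hostname of a deployed instance as the second argument should produce the equivalent URL for the hosted service.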
If you want to use puppeteer-js-renderer for scraping, you can use the following URL on Fly.io:
Styles and images will look broken but the HTML tags will be there. Happy Web Scraping!
Fly.io is a platform for applications that need to run globally. It runs your code close to users and scales compute in cities where your app is busiest. Write your code, package it into a Docker image, deploy it to Fly's platform, and let that do all the work to keep your app snappy.
Please follow these steps to deploy your own puppeteer-js-renderer service on Fly.io:
- Install the flyctl CLI command.
- Register on Fly with flyctl auth signup. If you already have a Fly account, log in with flyctl auth login.
- Clone this repo with git clone git@github.com:fly-examples/puppeteer-js-renderer.git if you are logged in with SSH support enabled. Otherwise, try git clone https://github.com/fly-examples/puppeteer-js-renderer.git.
- Then run
- After that, execute flyctl init --dockerfile and, when asked for an app name, hit return to have one generated (unless there's a name you really want). I ran it with js-renderer-fly as the app name for this example.
- Subsequently, you can select an organization. Usually this will be your first name-last name on the prompt.
- It should create a fly.toml file in the project root (I have not committed it; it is listed in .gitignore).
- Now run flyctl deploy to deploy the app. It will build the Docker image, push it to the Fly Docker image registry, and deploy it. While deploying, it will display information about the number of instances and their health.
- You can then run flyctl info. This will give the details of the app, including its hostname.
- You can view your app in the browser with flyctl open. For me, it opened
/api/render?url=https://www.youtube.com/watch?v=kJQP7kiw5Fk on the address bar of your browser after your puppeteer-js-renderer URL. Mine looked like
You can suspend your service with flyctl suspend; this will pause your service until you resume it. If you try flyctl status after suspending, it will not show any instances running. Suspending the service means all resources allocated to run it are deallocated, so the application runs nowhere. To get the instances running again execute
I wanted to see what resources were allocated to the app on Fly by default. The scale commands let me find out pretty easily:
- flyctl scale show reported VM Size: micro-2x.
- flyctl scale vm reported that micro-2x has 0.25 CPU cores and 512 MB of memory.
If you want to increase CPU or memory, or run more instances in a particular region, please refer to the official Fly docs on scaling.
Our service is running in one data center. For me, it's iad (Ashburn, Virginia), but yours will likely be different based on where you are working from. We can add instances around the world to speed up responses. Let's get rolling:
- To see the available regions, run flyctl platform regions. I could see regions all over the world, from Oregon to Sydney.
- Let's add an instance in Sydney, Australia. To do this, run flyctl regions add syd. Yes, it is that easy.
- Now check flyctl status and you will see an instance running in Sydney.
- Let's add one more in Europe, in Amsterdam, with flyctl regions add ams. So now we are mostly covered, with the app running on 3 continents.
- Of course, you can run flyctl status again to see your app shining on 3 continents.