Session Management via Puppeteer

1. What is Puppeteer?

Puppeteer is a Node.js library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. In simpler terms, because Puppeteer and Chrome come from the same parent, Puppeteer has close control over the browser and a lot of flexibility: most things you can do manually in the browser can also be done with Puppeteer.

Puppeteer’s feature set competes well with other open-source automation tools.

2. Steps for installing Puppeteer and the other packages in a project

I. Go to your project directory and run: npm i puppeteer
II. Install the chokidar package in the project directory: npm i chokidar
III. Install the async package in the project directory: npm i async
IV. Finally, run npm install in the project directory

Note: A recent Node.js release must be installed on your system.
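
To make the note above concrete, a small pre-flight check can compare the running Node.js version against a minimum. This is only a sketch: the file name check-node.js, the function meetsMinimum, and the value of MIN_MAJOR are assumptions for illustration, and the actual minimum depends on the Puppeteer release you install.

```javascript
// check-node.js: a hypothetical pre-flight check, not part of the original project.
// MIN_MAJOR is an assumed value; consult the Puppeteer release notes for the real requirement.
const MIN_MAJOR = 18;

// Returns true when a version string such as "20.11.1" or "v20.11.1"
// has a major version greater than or equal to minMajor.
function meetsMinimum(version, minMajor = MIN_MAJOR) {
  const major = Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

if (!meetsMinimum(process.versions.node)) {
  console.warn(`Node ${process.versions.node} is older than the assumed minimum ${MIN_MAJOR}.`);
}
```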

3. Folder Structure for session management

  • Create a file login.js - To log in to the website
  • Create a file index.js - To keep track of the login and manage the session
  • Create a file redirect.js - To create a new file so that the watcher event fires

4. A closer look at each file and its role

  1. login.js: This file contains the code that opens the browser and logs in via Puppeteer. The options shown below use headless: false, which opens a visible browser window; set it to true to run in headless mode.
    • Import the package and set up variables
    • const puppeteer = require("puppeteer");
       
      let page = null;
      let browser = null;
      
    • Create a login function and put the login logic inside it. The function is exported at the end of the file so that other files can use it.
    • async function login() { // Login function start
      
    • Set up the Chrome launch options inside the login function.
    •  let chromeOptions = {
         headless: false, 
         slowMo: 10,
         defaultViewport: null,
         args: ["--no-sandbox", "--disable-setuid-sandbox"]
       };
      
    • Launch the browser, open a new page, and define the user agent string inside the login function
    •  browser = await puppeteer.launch(chromeOptions);
       page = await browser.newPage();
       
       // Note the space after the closing parenthesis; without it the two parts run together.
       const userAgent =
         "Mozilla/5.0 (X11; Linux x86_64) " +
         "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.39 Safari/537.36";
      
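    • Set the user agent, navigate to the destination site, and wait until it has completely loaded inside the login function
    •  await page.setUserAgent(userAgent);
       
       await page.goto("http://example.com", {
         waitUntil: "networkidle0",
         timeout: 0 // 0 disables the navigation timeout
       });
       
       // page.waitFor() is deprecated in newer Puppeteer releases, so use the
       // timeout helper defined at the end of this file instead.
       await timeout(3000);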
      
    • Once the page has loaded, fill in the username and password fields and click the submit button inside the login function. Puppeteer performs every step, so nothing has to be done by hand.
    • try {
         await page.waitForSelector("#username");
       } catch (e) {
         console.log(`Error while waiting for #username : ${e}`);
       }
       
       await page.type("#username", "tester");
       await page.type("#password", "tester");
       
       await page.click('button[name="submit"]');
       
       // Give the site time to process the login before the page is handed back.
       await timeout(4000);
       
       return page; // an async function already wraps its return value in a promise
      } // Login function end
      
    • Define the timeout helper and export the login function so that other files can import and use it
    • const timeout = millis => new Promise(resolve => setTimeout(resolve, millis));
      module.exports.login = login;
      
  2. index.js: This is our entry file. It calls login.js, which logs us in to the website. After the login, it waits for events from the chokidar watcher.

    • Import all the necessary packages and set up variables
    • const login = require("./login");
      const chokidar = require("chokidar");
      const fs = require("fs");
      const async = require("async");
      const path = __dirname + "/watcherFiles";
      const dataPath = __dirname + "/data";
      let url = "http://example.com/test";
      let statFile = "";
      let page = null;
      
    • Create the entry-point function startProcess and configure the watcher via chokidar
    • async function startProcess() {
       page = await login.login();
      }
       
      const watcher = chokidar.watch("watcherFiles", {
       persistent: true,
       ignoreInitial: true,
       binaryInterval: 10000
      });
      
    • Create a function readFile that reads a file from the specified location and resolves with its contents.
    • async function readFile(path) {
         return new Promise(function(resolve, reject) {
           fs.readFile(`${path}`, { encoding: "utf-8" }, async function(err, url) {
             let urlSite = url;
             if (null === err) {
               resolve(urlSite);
             } else {
               reject(err);
             }
           });
         });
       }
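
Node.js also ships promise-based file APIs, so the manual Promise wrapper above can be shortened. The following is a minimal sketch using fs.promises.readFile; the name readUrlFile is hypothetical, and trimming the contents is an added assumption so that a trailing newline in the file does not end up in the URL.

```javascript
const fsPromises = require("fs").promises;

// Reads a trigger file and resolves with the URL stored in it,
// with surrounding whitespace removed. Rejects if the file cannot be read.
async function readUrlFile(filePath) {
  const contents = await fsPromises.readFile(filePath, { encoding: "utf-8" });
  return contents.trim();
}
```

It can be used in the same place as readFile, for example `const url = await readUrlFile(path);`.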
      
    • Add an async queue that holds our tasks and runs the provided worker for each of them, one at a time. The worker calls the main() function to do the actual processing. Whenever a new event fires, it is added to the queue and processed in turn.
    • let q = async.queue(function(task, callback) {
         (async function() {
           try {
             let url = task.name;
             let result = await main(url);
             callback(null, result); // the first argument is reserved for an error
           } catch (err) {
             callback(err);
           }
         })();
      }, 1);
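
The one-at-a-time behaviour of a queue with concurrency 1 can be illustrated without the async package. The sketch below is a simplified stand-in, not how the async library is implemented; createSerialQueue is a hypothetical name, and it returns promises instead of taking callbacks.

```javascript
// Simplified stand-in for a queue with concurrency 1:
// tasks run one at a time, in the order they were pushed.
function createSerialQueue(worker) {
  // Tail of the promise chain; each new task is appended to it.
  let tail = Promise.resolve();
  return {
    // Schedules a task and returns a promise for the worker's result.
    push(task) {
      const run = tail.then(() => worker(task));
      // Swallow the failure on the chain only, so later tasks still run.
      tail = run.catch(() => {});
      return run;
    }
  };
}
```

Because every task waits for the previous one, two page navigations can never overlap on the single shared page object, which is the reason the queue is limited to one task at a time.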
      
    • Create a main function that navigates to the requested page and checks whether the user is still logged in. If the login form is shown again, the session has expired and the function logs in again.
    • async function main(url) {
         await page.goto(url, {
           waitUntil: "networkidle0",
           timeout: 0
         });
         // The login form reappears when the session has expired, so log in again.
         if (await page.$("#username")) {
           await page.type("#username", "tester");
           await page.type("#password", "tester");
           await page.click('button[name="submit"]');
           await timeout(4000);
         }
         try {
           await page.waitForSelector(".some-class-to-load");
         } catch (e) {
           console.log(`Error while waiting for .some-class-to-load : ${e}`);
           throw e; // stop here instead of continuing as if the page had loaded
         }
         await timeout(4000);
         return true;
       }
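
The session check inside main() can also be factored into its own helper, which makes it testable without launching a browser. This is a sketch under assumptions: ensureLoggedIn and the credentials object are hypothetical names, the selectors are the ones used above, and the helper relies only on the $, type, and click methods of whatever page object it receives.

```javascript
// Hypothetical helper: logs in again only when the login form is present.
// Resolves with true when a re-login was performed, false otherwise.
async function ensureLoggedIn(page, credentials) {
  // page.$ resolves with null when no element matches the selector.
  const loginField = await page.$("#username");
  if (!loginField) {
    return false; // No login form, so the session is still valid.
  }
  await page.type("#username", credentials.username);
  await page.type("#password", credentials.password);
  await page.click('button[name="submit"]');
  return true;
}
```

Passing the page in as an argument, rather than reading a module-level variable, is a design choice that lets a stub object stand in for the real Puppeteer page during testing.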
      
    • Create the handler for the watcher's add event. It is called each time the user runs our other script (redirect.js), which creates a new file inside the watcherFiles folder. The handler reads the URL from the new file and pushes it to the async queue, and the queue calls our main function, which navigates to the appropriate page.
    • watcher.on("add", async path => {
         // Trim the contents so a trailing newline does not end up in the URL.
         let url = (await readFile(path)).trim();
         q.push({ name: url }, function(err) {
           if (err) {
             console.log(`Failed to process ${url} : ${err}`);
           }
         });
       });
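
The add handler above is an async function, so an error thrown while reading the file would surface as an unhandled promise rejection. One possible way to guard against that is sketched below. Node's built-in EventEmitter stands in for the chokidar watcher here, and onFileAdded is a hypothetical helper name.

```javascript
const { EventEmitter } = require("events");

// Hypothetical wiring helper: forwards every "add" event to handleFile and
// reports failures through onError instead of leaving them unhandled.
function onFileAdded(watcher, handleFile, onError = console.error) {
  watcher.on("add", filePath => {
    Promise.resolve()
      .then(() => handleFile(filePath))
      .catch(onError);
  });
}

// Usage with a stand-in watcher; the real project would pass the chokidar watcher.
const demoWatcher = new EventEmitter();
onFileAdded(demoWatcher, async filePath => {
  console.log(`New trigger file: ${filePath}`);
});
demoWatcher.emit("add", "watcherFiles/ranking_1.txt");
```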
      
    • At the end of this file we call startProcess(), which in turn calls our login function and performs the login. This happens as soon as the index.js process starts.
    • startProcess();
      const timeout = millis => new Promise(resolve => setTimeout(resolve, millis));
      
  3. redirect.js: This script simply creates a file inside the watcherFiles folder. The watcher in index.js keeps listening for file events, so the moment a new file is created in that folder, the watcher code runs and the rest of the process continues inside index.js.

    • const fs = require("fs").promises;
      const dirPath = __dirname + "/watcherFiles";
      // URL that index.js should open next; this value is the one defined in index.js.
      const url = "http://example.com/test";
       
      async function startProcess() {
       // Remove old trigger files first so the cleanup cannot delete the new file.
       const files = await fs.readdir(dirPath);
       await Promise.all(files.map(file => fs.unlink(dirPath + "/" + file)));
       
       // index.js reads the target URL from the file, so write it into the file.
       // Date.now() gives a more unique name than getMilliseconds(), which only ranges from 0 to 999.
       await fs.writeFile(`${dirPath}/ranking_${Date.now()}.txt`, url);
       console.log("File is created successfully.");
      }
       
      startProcess().catch(err => {
       console.error(err);
       process.exitCode = 1;
      });
      

5. The order in which to run the files

  • First run: node index.js (it keeps running and waits for watcher events)
  • Then, in a second terminal, run: node redirect.js

Conclusion

By combining Puppeteer, the async queue, and a chokidar file watcher, we can automate interaction with a website and keep the login session alive between requests.