In one of the early columns, I introduced the idea of automating the Chrome browser so that it can perform repetitive tasks in the absence of an API. Our idea is to take a screenshot of a web page and apply object detection to the image to get structured data, so browser automation can automate part of the process. For example, if you have a list of product URLs, you can run the puppeteer script (or pyppeteer in Python) to open all the web pages and save the screenshot to disk. Well, screenshots alone are not enough. You also need a label for each structured data type and bounding box. There are at least two approaches to doing this. The first obvious way is to manually label the screenshots to the data entry person / team.
This is the approach I took when I first tried it a few years ago. How to use computer vision to automatically generate structured data Using an open ghost mannequin effect source tool called LabelImg, the data entry person manually drew a box over the object of interest. advertisement Continue reading below Labelin