
Python code to scrape images from Google

Web scraping is the use of bots to extract data or content from websites.
Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol or through a web browser. While web scraping can be done manually by a software user, the term typically refers to automated processes implemented using a bot or web crawler. It is a form of copying in which specific data is gathered and copied from the web, typically into a central local database or spreadsheet, for later retrieval or analysis. [source: Wikipedia]
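To make the definition concrete, here is a minimal sketch of "gathering specific data and copying it into a spreadsheet". It uses only Python's standard library rather than the libraries used later in this post, and it works on a hard-coded HTML string instead of a live page. The names `ImageCollector`, `extract_image_urls`, `to_csv`, and `SAMPLE_HTML` are made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_urls(html):
    """Return the image URLs found in an HTML string, in document order."""
    collector = ImageCollector()
    collector.feed(html)
    return collector.sources


def to_csv(urls):
    """Render the URLs as CSV text, the 'spreadsheet' from the definition."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["index", "url"])
    for index, url in enumerate(urls, start=1):
        writer.writerow([index, url])
    return buffer.getvalue()


# Stand-in for a downloaded page; a real scraper would fetch this over HTTP.
SAMPLE_HTML = """
<html><body>
  <img src="https://example.com/cat.jpg" alt="cat">
  <p>No image here</p>
  <img src="https://example.com/dog.png">
  <img alt="missing src">
</body></html>
"""

if __name__ == "__main__":
    print(to_csv(extract_image_urls(SAMPLE_HTML)))
```

The `<img>` tag without a `src` attribute is skipped, so only two URLs end up in the CSV output.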
How to scrape images from Google?
We will use Python as the base language, along with libraries such as BeautifulSoup, Selenium, os, and time, among others, to build a scraper from scratch.
## Required Libraries
## 'download_image': a method that downloads each image with the help of the requests library. If the returned status is 200, we write the image to our machine via file handling.
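A minimal sketch of what such a `download_image` helper might look like is shown here. It is not the original script's code. The parameter names, the `image_<number>.jpg` file naming, and the optional `fetch` argument are assumptions added for illustration; `fetch` exists only so the function can be tried without a network connection.

```python
import os


def download_image(url, folder, number, fetch=None):
    """Download one image into `folder`; return True on success, else False.

    `fetch` is a hypothetical hook: any callable that takes a URL and returns
    an object with `status_code` and `content`. It defaults to requests.get.
    """
    if fetch is None:
        import requests  # imported lazily so the function can be tested offline

        def fetch(target):
            return requests.get(target, timeout=10)

    try:
        response = fetch(url)
    except OSError:
        # requests' own exceptions derive from OSError (connection errors, timeouts).
        return False

    # Only a 200 response is treated as a successful download.
    if response.status_code != 200:
        return False

    os.makedirs(folder, exist_ok=True)
    # The .jpg extension is an assumption; the real format may differ.
    path = os.path.join(folder, f"image_{number}.jpg")
    with open(path, "wb") as handle:  # binary mode, since images are raw bytes
        handle.write(response.content)
    return True
```

A caller can increment a failure counter whenever the function returns `False`, which fits the role of the `download_failed` variable.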
## 'download_failed': a variable that tracks the number of failed downloads.
## 'to_search': a variable that asks the user for the topic name whose images should be downloaded into the directory.
## Create a directory named after the topic whose images we want to download.
## Give the path of the webdriver we have installed, so that the script can open the Chrome browser automatically.
## The topic name the user entered when running the script, which the 'to_search' variable holds, is formatted into the Google link, and we use the get method to open the URL automatically. Once the URL is open, we scroll the web page 5 times so that as many images as possible can be downloaded.
## Get the page source and use the BeautifulSoup library to find all the 'div' elements matching the 'class' we specified in the find_all method. This finds all the images on the page.
## If you check Google Images in the browser, you will usually see a suggestion box at every 25th position (25, 50, 75, and so on). Clicking one of these takes us to a link page rather than downloading an image, so we skip these boxes. For the rest of the elements, which are actual images, we find the XPath and the script clicks on them. If the image the script clicked on is different or just a thumbnail, we download the original image rather than the thumbnail.
## At the end, we display the number of images that were downloaded successfully and the number that failed. You can find all the images in the working directory, in a folder named after the topic you entered when running the script.
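The steps above can be sketched roughly as follows. This is a hedged outline under stated assumptions, not the original script. The search URL template, the `RESULT_CLASS` class name, the `CHROMEDRIVER_PATH` value, and the `THUMBNAIL_XPATH` pattern are all placeholders, because the post does not show the real values. The step that locates the full-size image after a click is page specific and is left as a comment.

```python
import os
import time
from urllib.parse import quote_plus

# Placeholders for illustration only; the post does not show these values.
SEARCH_URL = "https://www.google.com/search?q={query}&tbm=isch"
RESULT_CLASS = "result-container"
CHROMEDRIVER_PATH = "/path/to/chromedriver"
THUMBNAIL_XPATH = "(//div[@class='{cls}'])[{position}]"


def build_search_url(topic):
    """Format the topic into the (assumed) image-search URL."""
    return SEARCH_URL.format(query=quote_plus(topic))


def is_suggestion_box(position):
    """True for the 1-based positions 25, 50, 75, ... that should be skipped."""
    return position % 25 == 0


def scroll_page(driver, times=5, pause=2.0):
    """Scroll to the bottom `times` times so that more thumbnails load."""
    for _ in range(times):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)


def summarize(attempted, failed):
    """Build the final report line."""
    return f"{attempted - failed} images downloaded, {failed} failed"


def run_scraper():
    # Third-party imports stay local so the helpers above work without them.
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    to_search = input("Topic to download images for: ").strip()
    os.makedirs(to_search, exist_ok=True)

    driver = webdriver.Chrome(service=Service(CHROMEDRIVER_PATH))
    try:
        driver.get(build_search_url(to_search))
        scroll_page(driver)
        soup = BeautifulSoup(driver.page_source, "html.parser")
        containers = soup.find_all("div", {"class": RESULT_CLASS})

        attempted = download_failed = 0
        for position in range(1, len(containers) + 1):
            if is_suggestion_box(position):
                continue  # suggestion boxes lead to a link page, not an image
            xpath = THUMBNAIL_XPATH.format(cls=RESULT_CLASS, position=position)
            driver.find_element(By.XPATH, xpath).click()
            attempted += 1
            # Finding the full-size image URL and saving it is page specific
            # and not shown in the post; count the item as failed until that
            # step is filled in.
            download_failed += 1
        print(summarize(attempted, download_failed))
    finally:
        driver.quit()
```

Nothing runs until `run_scraper()` is called, and the small helpers can be checked on their own without Selenium or a browser installed.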
Usage :

You can find the full source code in this repo: scraper
