python - How to extract and download all images from a website using BeautifulSoup?


I am trying to extract and download images from a URL. I wrote this script:

    import urllib2
    import re
    from os.path import basename
    from urlparse import urlsplit

    url = "http://filmygyan.in/katrina-kaifs-top-10-cutest-pics-gallery/"
    urlcontent = urllib2.urlopen(url).read()
    # html image tag: <img src="url" alt="some_text"/>
    imgurls = re.findall('img .*?src="(.*?)"', urlcontent)

    # download all images
    for imgurl in imgurls:
        try:
            imgdata = urllib2.urlopen(imgurl).read()
            # index 2 of the urlsplit() result is the path component
            filename = basename(urlsplit(imgurl)[2])
            output = open(filename, 'wb')
            output.write(imgdata)
            output.close()
        except:
            pass

I don't want to extract only the image on this page (see this image: http://i.share.pho.to/1c9884b1_l.jpeg). I want to get all the images without clicking the "Next" button. I can't work out how to get all the pictures behind the "Next" class. What changes should I make to findall?

If you only want the pictures, you can download them without scraping the webpage. They all share the same URL pattern:

    http://filmygyan.in/wp-content/gallery/katrina-kaifs-top-10-cutest-pics-gallery/cute1.jpg
    http://filmygyan.in/wp-content/gallery/katrina-kaifs-top-10-cutest-pics-gallery/cute2.jpg
    ...
    http://filmygyan.in/wp-content/gallery/katrina-kaifs-top-10-cutest-pics-gallery/cute10.jpg

So simple code like this will give you all the images:

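    import os
    import urllib

    baseurl = "http://filmygyan.in/wp-content/gallery/katrina-kaifs-top-10-"\
          "cutest-pics-gallery/cute%s.jpg"

    # the pictures are numbered cute1.jpg through cute10.jpg
    for i in range(1, 11):
        url = baseurl % i
        # save each picture under its own file name in the current directory
        urllib.urlretrieve(url, os.path.basename(url))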
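The download loop above is Python 2. As a rough sketch only, the same idea under Python 3 could look like the block below, where urlretrieve lives in urllib.request. The names BASE_URL, image_urls and download_all are made up for this illustration, and the error handling is an addition that the original loop does not have.

```python
import os
from urllib.parse import urlsplit
from urllib.request import urlretrieve

# URL pattern taken from the answer above; %s is the picture number.
BASE_URL = ("http://filmygyan.in/wp-content/gallery/"
            "katrina-kaifs-top-10-cutest-pics-gallery/cute%s.jpg")


def image_urls(first=1, last=10):
    """Return the gallery URLs for pictures first..last (inclusive)."""
    return [BASE_URL % i for i in range(first, last + 1)]


def download_all(urls, target_dir="."):
    """Save each URL into target_dir, named after the last path segment."""
    saved = []
    for url in urls:
        filename = os.path.basename(urlsplit(url).path)
        path = os.path.join(target_dir, filename)
        try:
            urlretrieve(url, path)
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            print("skipping %s: %s" % (url, exc))
            continue
        saved.append(path)
    return saved


# Not called here, so running the block makes no network requests:
# download_all(image_urls())
```

Calling download_all(image_urls()) would try all ten pictures and return the paths that were actually written, skipping any URL that fails instead of silently swallowing every error.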

With BeautifulSoup you will have to click through or go to the next page to scrape the images. If you want to scrape each page individually, try scraping them using their class shutterset_katrina-kaifs-top-10-cutest-pics-gallery.
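Selecting by that class with BeautifulSoup might look roughly like the sketch below (Python 3, bs4 package). It assumes the class sits on a link whose href points at the full-size picture, or on an img tag with a src; that is a guess about the page's markup, not something confirmed here. SAMPLE_HTML is hypothetical markup and gallery_image_urls is a made-up helper name.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

GALLERY_CLASS = "shutterset_katrina-kaifs-top-10-cutest-pics-gallery"


def gallery_image_urls(html, page_url, css_class=GALLERY_CLASS):
    """Return absolute image URLs from elements carrying css_class."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for tag in soup.find_all(class_=css_class):
        # Assumption: a link's href (or an image's src) holds the picture URL.
        target = tag.get("href") or tag.get("src")
        if target:
            # Resolve relative links against the page they came from.
            urls.append(urljoin(page_url, target))
    return urls


# Hypothetical markup, only to show the call; the real page may differ.
SAMPLE_HTML = """
<a class="shutterset_katrina-kaifs-top-10-cutest-pics-gallery"
   href="/wp-content/gallery/katrina-kaifs-top-10-cutest-pics-gallery/cute1.jpg">
  <img src="thumb1.jpg" alt="thumbnail"/>
</a>
"""
PAGE_URL = "http://filmygyan.in/katrina-kaifs-top-10-cutest-pics-gallery/"
print(gallery_image_urls(SAMPLE_HTML, PAGE_URL))
```

For the real site you would fetch each gallery page, pass its HTML and URL to gallery_image_urls, and then download the returned links. This still means one request per page, as noted above.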

