Web scraping series :
(Dig out web data)
Part 1:
URL used for scraping : http://www.assamyellowpage.com
Today I am going to dig out data from a website without opening it in a browser. Say we search for a word (jalukbari) on the site http://www.assamyellowpage.com. It will look like the screenshot below.
In the result we get a list of matching entries.
Now my script should dig out the same data from the site:
import requests
from bs4 import BeautifulSoup
import os

# Ask for a search term and plug it into the site's search URL
a = raw_input('Enter your query:')
r = requests.get('http://www.assamyellowpage.com/search?title=&street=%s&taxonomy_vocabulary_10_tid=All' % a)

# Parse the returned HTML and grab the block that holds the result list
soup = BeautifulSoup(r.content, 'html.parser')
g_data = soup.find_all("div", {"class": "item-list"})

os.system('clear')  # clear the terminal before printing

# Print the title and description of every entry in the result list
titles = g_data[0].find_all("div", {"class": "views-field views-field-title"})
descriptions = g_data[0].find_all("span", {"class": "field-content"})
for i in range(len(titles)):
    print 'Institute-' + str(i), ":" + titles[i].text
    print '\nDescription:', descriptions[i].text
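One caveat about this script: if the search returns no matches, g_data will be an empty list and g_data[0] will raise an IndexError. A small guard like this, placed right after g_data is assigned (my own addition, not part of the original script), stops the script with a message instead of a traceback:

if not g_data:
    print 'No result found for:', a
    raise SystemExit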
After running the script you will get the following result:
We got the exact same result as we got from the search above. Below is another search result.
Enjoy. We have just created an API for the site http://www.assamyellowpage.com.
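If you want to call this "API" from other scripts rather than at the prompt, the same logic can be wrapped in a function. The sketch below is my own refactor, not part of the original script: search_yellowpage is a made-up name, requests' params argument is used so the query gets URL-encoded automatically, and the results come back as a list of dicts instead of being printed.

import requests
from bs4 import BeautifulSoup

def search_yellowpage(query):
    # Build the same search URL as before; requests URL-encodes the query value
    params = {'title': '', 'street': query, 'taxonomy_vocabulary_10_tid': 'All'}
    r = requests.get('http://www.assamyellowpage.com/search', params=params)
    soup = BeautifulSoup(r.content, 'html.parser')

    g_data = soup.find_all("div", {"class": "item-list"})
    if not g_data:
        return []

    titles = g_data[0].find_all("div", {"class": "views-field views-field-title"})
    descriptions = g_data[0].find_all("span", {"class": "field-content"})
    return [{'title': t.text.strip(), 'description': d.text.strip()}
            for t, d in zip(titles, descriptions)]

# Example usage
for entry in search_yellowpage('jalukbari'):
    print entry['title'], '-', entry['description']

Each dict holds one listing, so any other script can import the function and work with the data without touching the HTML.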
In the next article I will scrape data from different sites like facebook.com, google.com, twitter.com, and many more.
Connect With Me: Facebook