Monday, 14 November 2016

Web scraping with Python BeautifulSoup


Web scraping series:

(Dig out web data)

Part 1:

URL used for scraping: http://www.assamyellowpage.com

Today I am going to dig out data from a website without opening it in a browser. Say we search for a word (jalukbari) on the site http://www.assamyellowpage.com. It will look like below.


 

The search gives us a list of matching results.

Now my script should dig out the same data from the site:


import os

import requests
from bs4 import BeautifulSoup

# Ask for the search term (for example: jalukbari)
query = input('Enter your query: ')

# Let requests build and URL-encode the query string
params = {'title': '', 'street': query, 'taxonomy_vocabulary_10_tid': 'All'}
r = requests.get('http://www.assamyellowpage.com/search', params=params)
soup = BeautifulSoup(r.content, 'html.parser')

# The first "item-list" div holds the search results
g_data = soup.find_all('div', {'class': 'item-list'})
titles = g_data[0].find_all('div', {'class': 'views-field views-field-title'})
descriptions = g_data[0].find_all('span', {'class': 'field-content'})

os.system('clear')
for i in range(len(titles)):
    print('Institute-' + str(i), ':' + titles[i].text)
    print('\nDescription:', descriptions[i].text)
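
The search address the script requests carries the query in the street parameter. As a side note, here is a small sketch of how such an address can be put together with proper escaping, so that a query containing spaces or an & does not break the URL. It uses only the Python 3 standard library. The name build_search_url is my own choice for illustration and is not part of the script.

```python
from urllib.parse import urlencode

# Base search address, taken from the script above
SEARCH_URL = 'http://www.assamyellowpage.com/search'


def build_search_url(query):
    """Return the search URL for a query, with special characters escaped.

    Hypothetical helper, written only for illustration.
    """
    # Same three parameters the script sends; only `street` varies
    params = {
        'title': '',
        'street': query,
        'taxonomy_vocabulary_10_tid': 'All',
    }
    return SEARCH_URL + '?' + urlencode(params)


print(build_search_url('jalukbari'))
# http://www.assamyellowpage.com/search?title=&street=jalukbari&taxonomy_vocabulary_10_tid=All
```

A query such as 'g s road' comes out as street=g+s+road, which is the usual encoding for spaces in a query string.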


After running the script you will get the following result:





 







We got exactly the same result as from the search above. Below is another search result.










Enjoy. We have just created an API for the site http://www.assamyellowpage.com.

In the next article I will scrape data from different sites like facebook.com, google.com, twitter.com and many more.
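
To make the "API" idea a bit more concrete, the parsing part can be wrapped in a function that returns data instead of printing it. The sketch below is only that, a sketch: the name parse_results and the HTML fragment are made up by me, and the real page layout may differ from this sample. It looks up the same class names as the script and runs offline, without touching the site.

```python
from bs4 import BeautifulSoup

# Made-up HTML fragment for illustration; the real page may look different
SAMPLE_HTML = """
<div class="item-list">
  <div class="views-field views-field-title">Example Institute</div>
  <span class="field-content">Jalukbari, Guwahati</span>
  <div class="views-field views-field-title">Sample Traders</div>
  <span class="field-content">Near Jalukbari Chariali</span>
</div>
"""


def parse_results(html):
    """Return a list of {'name', 'description'} dicts found in the HTML.

    Hypothetical helper; uses the same class names as the script above.
    """
    soup = BeautifulSoup(html, 'html.parser')
    blocks = soup.find_all('div', {'class': 'item-list'})
    if not blocks:
        # No result block on the page, so nothing to report
        return []
    titles = blocks[0].find_all('div', {'class': 'views-field views-field-title'})
    descriptions = blocks[0].find_all('span', {'class': 'field-content'})
    # Pair each title with the description at the same position
    return [
        {'name': t.get_text(strip=True), 'description': d.get_text(strip=True)}
        for t, d in zip(titles, descriptions)
    ]


for entry in parse_results(SAMPLE_HTML):
    print(entry['name'], '-', entry['description'])
```

Against the live site, the same function could be given r.content from requests.get. Returning a list of dictionaries instead of printing makes it easy to reuse the result, for example to dump it as JSON.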

Connect With Me: Facebook

