python - How do I display sentences from a website?


I decided to make a little project to learn how to use mechanize. It goes to Urban Dictionary, fills in the word 'skid' in the search form, presses submit, and prints out the HTML.

What I want to do is find the first definition and print it out. How would I go about that?

This is my source code so far:

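    import mechanize

    br = mechanize.Browser()
    page = br.open("http://www.urbandictionary.com/")

    # Fill in the first form on the page (the search box) and submit it.
    br.select_form(nr=0)
    br["term"] = "skid"
    br.submit()

    # Dump the raw HTML of the results page.
    print(br.response().read())

Here's where the definition is stored: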

<div class="definition">canadian definition: commonly used refer        stopped evolving, , bathing, during 80&#x27;s hair band era.  can found wearing ac/dc muscle shirts, leather jackets, , sporting <a href="/define.php?term=mullet">mullet</a>.  term &quot;skid&quot; in part derived &quot;skid row&quot;, both band enjoyed term refers to, address.  see <a href="/define.php?term=white%20trash">white trash</a> , <a href="/define.php?term=trailer%20park%20trash">trailer park trash</a></div><div class="example">the skid next door got drunk , beat old lady.</div> 

You can see it's stored inside the div with the class definition. I know how to search for the div inside the source code, but I don't know how to take everything that's between the tags and display it.

You can use lxml to parse the HTML fragment:

    import lxml.html as html
    import mechanize

    br = mechanize.Browser()
    page = br.open("http://www.urbandictionary.com/")

    br.select_form(nr=0)
    br["term"] = "skid"
    br.submit()

    # Parse the results page, then take the first element with class "definition".
    fragment = html.fromstring(br.response().read())

    print(fragment.find_class('definition')[0].text_content())

Note, however, that this solution strips any tags inside the div (such as the links) and flattens the result to plain text.
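
To see what that flattening looks like without hitting the site, here is a minimal sketch that assumes Python 3 and uses only the standard library's html.parser in place of lxml. It is a swapped-in technique, not the method from the answer above. The names FirstDefinitionParser, first_definition and sample are made up for illustration, and the sample string is a shortened piece of the markup quoted in the question.

```python
from html.parser import HTMLParser


class FirstDefinitionParser(HTMLParser):
    """Collect the flattened text of the first <div class="definition">."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0      # how many <div> levels deep we are inside the target
        self.found = False  # True once the target div has been entered
        self.done = False   # True once the target div has been closed
        self.parts = []     # text chunks seen inside the target div

    def handle_starttag(self, tag, attrs):
        if self.done or tag != "div":
            return
        if self.depth > 0:
            self.depth += 1  # a nested div inside the definition
        elif "definition" in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entering the first definition div
            self.found = True

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True  # ignore any later definition divs

    def handle_data(self, data):
        # Inner tags such as <a> are skipped; only their text is kept.
        if self.depth > 0:
            self.parts.append(data)


def first_definition(page_html):
    """Return the text of the first definition div, or None if there is none."""
    parser = FirstDefinitionParser()
    parser.feed(page_html)
    parser.close()
    return "".join(parser.parts) if parser.found else None


# Shortened sample based on the markup shown in the question.
sample = (
    '<div class="definition">sporting '
    '<a href="/define.php?term=mullet">mullet</a>. term &quot;skid&quot;</div>'
    '<div class="example">the skid next door got drunk.</div>'
)

print(first_definition(sample))
```

The link markup disappears and the entities are decoded, which is the same kind of flattening text_content() performs. In the mechanize script you would pass the decoded page source to first_definition instead of the sample string.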

