You can work with the HTML DOM in Python using the htmldom library, which can be installed with pip. Once it is installed, create a DOM like this:
from htmldom import htmldom
dom = htmldom.HtmlDom("https://www.github.com/")
dom = dom.createDom()
The above code creates an HtmlDom object. HtmlDom takes the URL of the page as a default parameter. Once the object is created, you need to call its "createDom" method. This parses the HTML data and constructs the parse tree, which can then be used for searching and manipulating the HTML data. The only restriction the library imposes is that the data, whether it is HTML or XML, must have a root element.
You can query the elements using the "find" method of the HtmlDom object:
p_links = dom.find("a")
for link in p_links:
    print("URL: " + link.attr("href"))
The above code prints every link (URL) found in the "a" elements of the web page.
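If you want to try the same link extraction without installing anything or hitting the network, here is a rough sketch that uses Python's standard-library html.parser instead of htmldom. It is not the htmldom API, just a comparable approach; the names LinkCollector, extract_links and SAMPLE_HTML are made up for this example, and the sample page is an assumed stand-in for a real one.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href of each <a> element in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Assumed sample page, used here instead of fetching a live URL.
SAMPLE_HTML = """
<html>
  <body>
    <a href="https://github.com/">GitHub</a>
    <a name="anchor-without-href">No link</a>
    <a href="/about">About</a>
  </body>
</html>
"""

for url in extract_links(SAMPLE_HTML):
    print("URL: " + url)
```

Anchors without an href attribute are skipped here, which avoids concatenating a missing value into the printed string.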