Retrieve page content #crawler
by Hocine Ferradj - 9 years ago (2015-11-03)
I need a crawler to get data from a URL.
2 Recommendations
This class can parse a Web page and extract information from it.
It can retrieve a Web page from a given URL and parse it to extract details like:
- Page title
- Page head and body
- Meta tags
- Character set
- Links expanded to full path
- Images
- Page headers from H1 through H6
- Internal and external links, with a check for broken ones
- Page elements by class or id value
by zinsou A.A.E.Moïse package author 6835 - 7 years ago (2017-12-22)
you can also try this...
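For a rough idea of what this kind of extraction involves, here is a minimal sketch in plain PHP using DOMDocument and DOMXPath. It is not the recommended class or its API: `extractPageDetails()` and `resolveUrl()` are hypothetical helper names, the URL resolution is deliberately simplified (no `../` handling), and only a subset of the listed details (title, meta tags, links, headings) is covered.

```php
<?php
// Hypothetical helper: resolve a link against the page URL (simplified, no "../" handling).
function resolveUrl(string $href, string $baseUrl): string
{
    if (!empty(parse_url($href, PHP_URL_SCHEME))) {
        return $href; // already absolute (http:, https:, mailto:, ...)
    }
    $parts  = parse_url($baseUrl);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');
    if (strncmp($href, '//', 2) === 0) {
        return $parts['scheme'] . ':' . $href; // protocol-relative link
    }
    if (strncmp($href, '/', 1) === 0) {
        return $origin . $href; // root-relative link
    }
    // Relative link: append to the directory part of the base path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $href;
}

// Hypothetical helper: extract a few page details from an HTML string.
function extractPageDetails(string $html, string $baseUrl): array
{
    $doc = new DOMDocument();
    // Real-world markup is often invalid; keep libxml warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titleNode = $doc->getElementsByTagName('title')->item(0);
    $details = [
        'title'    => $titleNode ? trim($titleNode->textContent) : null,
        'meta'     => [],
        'links'    => [],
        'headings' => [],
    ];

    // Meta tags keyed by their name attribute.
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if ($meta->hasAttribute('name')) {
            $details['meta'][$meta->getAttribute('name')] = $meta->getAttribute('content');
        }
    }

    // Links expanded to full URLs.
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $details['links'][] = resolveUrl($a->getAttribute('href'), $baseUrl);
        }
    }

    // Headings H1 through H6, in document order.
    $xpath = new DOMXPath($doc);
    $query = '//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]';
    foreach ($xpath->query($query) as $h) {
        $details['headings'][] = [
            'level' => (int) substr($h->nodeName, 1),
            'text'  => trim($h->textContent),
        ];
    }

    return $details;
}

// Example with inline markup; the URL is a placeholder.
$sample = '<html><head><title>Demo</title></head>'
        . '<body><h1>Hello</h1><a href="/about">About</a></body></html>';
print_r(extractPageDetails($sample, 'https://example.com/docs/index.html'));
```

To run this against a live page, the HTML string could come from `file_get_contents($url)` (which requires `allow_url_fopen`) or from cURL. Checking for broken links would mean issuing a request per link, which is left out here.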
HTML Parser: Parse HTML using DOMDocument
This class can parse HTML documents using DOMDocument.
It can load the HTML markup either from a file or from a text string.
It can parse the entire document, returning an array of elements.
It can parse the document for a specific element, returning an array of each element found. It can also return the element's child elements.
It can return an element referenced by a given ID.
It can display the returned results in a human-readable form.
by Dave Smith package author 7620 - 9 years ago (2015-11-03)
You can take a look at this one. It was written as a challenge to parse a document using only DOMDocument in pure PHP, so you end up with all the elements in an array, or you can request a specific element.
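As an illustration of the DOMDocument-only approach described above, the following sketch turns elements into nested arrays, looks up elements by tag name, and finds one by ID. It is not the package's actual code: `loadHtmlDocument()`, `elementToArray()`, `findElements()` and `findById()` are hypothetical names chosen for this example, and the array layout is an assumption.

```php
<?php
// Hypothetical helper: load markup from a string into a DOMDocument.
function loadHtmlDocument(string $html): DOMDocument
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate imperfect markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $doc;
}

// Hypothetical helper: convert an element and its child elements to an array.
function elementToArray(DOMElement $el): array
{
    $attributes = [];
    foreach ($el->attributes as $attr) {
        $attributes[$attr->name] = $attr->value;
    }

    $children = [];
    foreach ($el->childNodes as $child) {
        if ($child instanceof DOMElement) { // skip text nodes and comments
            $children[] = elementToArray($child);
        }
    }

    return [
        'tag'        => $el->tagName,
        'attributes' => $attributes,
        'text'       => trim($el->textContent),
        'children'   => $children,
    ];
}

// Return every element with the given tag name as an array.
function findElements(DOMDocument $doc, string $tag): array
{
    $found = [];
    foreach ($doc->getElementsByTagName($tag) as $el) {
        $found[] = elementToArray($el);
    }
    return $found;
}

// Return the first element whose id attribute matches, or null if none does.
function findById(DOMDocument $doc, string $id): ?array
{
    // Compare the id attribute directly so the lookup does not depend on
    // how IDs were registered when the document was loaded.
    foreach ($doc->getElementsByTagName('*') as $el) {
        if ($el->getAttribute('id') === $id) {
            return elementToArray($el);
        }
    }
    return null;
}

// Example: print results in a human-readable form.
$doc = loadHtmlDocument('<ul id="menu"><li class="item">Home</li><li>About</li></ul>');
print_r(findElements($doc, 'li'));
print_r(findById($doc, 'menu'));
```

To load the markup from a file instead of a string, `DOMDocument::loadHTMLFile()` could replace the `loadHTML()` call.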