PHP Classes

Class to convert HTML into objects, XML DOM style



Everton da Rosa - 10 years ago (2014-11-12)



Hello friends, I need a class that converts HTML code (read from a string or a file) into an object, in the style of the XML DOM, with access to tags, attributes, and tag content.

  • 1 Clarification request
  • 1. Manuel Lemos - 10 years ago (2014-11-13)

    What about the DOM classes that come with PHP?

    • 2. Everton da Rosa - 10 years ago (2014-11-13), in reply to comment 1 by Manuel Lemos

      Do you mean the XML manipulation classes? I want something like them, but one that works on HTML documents. I thought about using the DOM classes, but I could run into problems with tag content that contains characters such as "<", which in XML documents would have to be wrapped in CDATA markup.

    • 3. Manuel Lemos - 10 years ago (2014-11-13), in reply to comment 2 by Everton da Rosa

      Yes, DOMDocument has a loadHTML function to parse HTML.

      I am not sure what your concern with CDATA sections is. I think they are treated like regular text sections: they are decoded, but the tag characters < and > are returned without special meaning, just like every other character.

      Did you try that or did you have any difficulties?
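The DOMDocument approach discussed in this thread can be sketched roughly as follows. This is only a minimal illustration, not a tested answer from the participants: the sample markup and all variable names are assumptions made up for the example. It parses an HTML string, reads a tag name, an attribute and the text content, and shows that an escaped "<" in the content comes back as a plain character.

```php
<?php
// Hypothetical sample markup; the paragraph text contains an escaped "<".
$html = '<html><body><p id="intro" class="lead">1 &lt; 2 is <b>true</b></p></body></html>';

$doc = new DOMDocument();

// Real-world HTML is often malformed, so collect parser warnings
// internally instead of letting them be emitted as PHP warnings.
$previous = libxml_use_internal_errors(true);
$doc->loadHTML($html); // $doc->loadHTMLFile($path) reads from a file instead
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Take the first <p> element in the document.
$p = $doc->getElementsByTagName('p')->item(0);

$tag    = $p->nodeName;              // tag name
$class  = $p->getAttribute('class'); // attribute value
$text   = $p->textContent;           // decoded text of the element and its children
$markup = $doc->saveHTML($p);        // serialized HTML of this node only

echo $tag, "\n", $class, "\n", $text, "\n", $markup, "\n";
```

In this sketch the text content is returned decoded ("1 < 2 is true"), while saveHTML() escapes the character again when serializing, so no CDATA handling is needed on the caller's side.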

    • 4. Everton da Rosa - 10 years ago (2014-11-18), in reply to comment 3 by Manuel Lemos

      Thanks, I will test the DOMDocument class.


1 Recommendation

HTMLPP: Parse HTML code and manage the DOM structure

HTMLPP is a PHP 4 library for parsing HTML code. It lets you parse an HTML code string, build the corresponding DOM structure, and work on it with methods similar to those available in JavaScript.

Features:


HTML parsing:
- Simple tags
- Tags without closures
- Autoclosing tags
- Doctype, text and comment parsing
- Modern browser parsing behaviour (adds head, body and html tags if they are not present; wraps table content inside a tbody if it is not present)

Dom traversing:
- Access to the parent node using the parentNode property
- Access to child nodes using the childNodes array property
- Access to sibling nodes using nextSibling and previousSibling properties
- Access to the owner document with ownerDocument property
- Document shortcuts to body, head and doctype

Dom manipulation:
- Append nodes with appendChild, append and other methods
- Remove nodes with removeChild and remove methods
- Replace nodes with replaceChild and replace methods

Attributes and style manipulation:
- Add, remove, set and get methods for attributes
- Add, remove, set and get methods for style properties

Node searching functions on every element:
- getElementById
- getElementsByTagName
- getElementsByClassName
- getElementsBySelector (full-featured support for CSS3 selectors, plus support for other non-standard selectors)
- Node iterator class for personalized filter functions

Dom collections with jQuery-like methods:
- Add, remove and filter elements in the collection
- Change the current collection by searching in its elements' siblings, child nodes or parent nodes
- Manipulate elements in the collection



Changelog:

1.0
- first release
1.0.1
- Fixed some bugs in elements parsing regexp
- Fixed a bug in doctype parsing
- Fixed some problems in the parser class
- Fixed a bug in the HTMLFilterIterator::find() function when passing HTML_SEARCH_DESCENDANT as the iteration type
1.0.2
- Fixed error on selector parsing
- Now every element is closed at the end of its parent code if no closing tag is found
- Better support for textarea tag
- Fixed bug on attributes parsing (thanks Mike)
1.0.3
- Fixed bug in getAttribute() method
- Fixed bug in getStyle() method
- Fixed bug on attributes parsing

Manuel Lemos (reputation 26695) - 9 years ago (2016-02-14)

There is this old class that can parse HTML using pure PHP and return a DOM-like document structure.

For most purposes the PHP DOM extension may be more useful, but if you run into a limitation of that extension, you may want to try this package.
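As a point of comparison, here is a rough sketch of how some of the traversal and search features listed above look with PHP's built-in DOM extension. It does not use HTMLPP, whose API is not shown on this page. The sample markup and variable names are assumptions for illustration. Because the classic DOMDocument class offers no getElementsByClassName method, the class lookup uses a common XPath idiom instead.

```php
<?php
// Hypothetical sample markup, for illustration only.
$html = '<html><body><ul><li class="item first">A</li><li class="item">B</li></ul></body></html>';

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true); // keep parser warnings quiet
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Traversal through standard DOM properties.
$first       = $doc->getElementsByTagName('li')->item(0);
$parentName  = $first->parentNode->nodeName;           // name of the enclosing element
$siblingText = $first->nextSibling->textContent;       // text of the following <li>
$childCount  = $first->parentNode->childNodes->length; // number of children of the <ul>

// Class-based search via XPath: match "item" as a whole word
// inside the space-separated class attribute.
$xpath = new DOMXPath($doc);
$items = $xpath->query(
    '//*[contains(concat(" ", normalize-space(@class), " "), " item ")]'
);
$itemCount = $items->length;

echo $parentName, "\n", $siblingText, "\n", $childCount, "\n", $itemCount, "\n";
```

Padding the class attribute with spaces before matching avoids false hits on class names that merely contain the substring, such as "items".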

