
31 October 2006

Modifying Recursive Data Structures in PHP

I’ve been working on code to allow pure HTML pages to be used as the source code of templates. Compiling this HTML mandates the use of a recursive data structure to form a tree of HTML tags and to preserve the textual content of the page. It’s reasonably easy to do in PHP5 because of the improved reference system in that platform. However, OSX 10.4 is bundled with PHP4 and that happens to be my home development machine. After many hours of massaging what is a reference and what expects a reference, I ended up with very ugly code which seemed to correctly generate the tree.
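To make the shape of such a tree concrete, here is a minimal PHP5-style sketch. The class name HtmlNode and its methods are hypothetical and are not taken from the actual project; text is not escaped, and void tags and attributes are ignored. The point is only that PHP5 passes objects around as handles, so a node can be modified after it has been attached to its parent.

```php
<?php
// Hypothetical node class, for illustration only (not the post's real code).
class HtmlNode
{
    public $tag;                 // tag name such as 'p', or null for a text node
    public $text;                // textual content, used by text nodes
    public $children = array();  // child HtmlNode objects

    public function __construct($tag, $text = '')
    {
        $this->tag = $tag;
        $this->text = $text;
    }

    // In PHP5 $child is an object handle, so no & is needed here.
    public function add(HtmlNode $child)
    {
        $this->children[] = $child;
        return $child;
    }

    // Recursively turn the tree back into markup.
    public function render()
    {
        if ($this->tag === null) {
            return $this->text;
        }
        $inner = '';
        foreach ($this->children as $child) {
            $inner .= $child->render();
        }
        return "<{$this->tag}>{$inner}</{$this->tag}>";
    }
}

$body = new HtmlNode('body');
$p = $body->add(new HtmlNode('p'));
$p->add(new HtmlNode(null, 'Hello'));  // changes the node already stored in $body

$html = $body->render();
```

Under PHP4 the same calls would copy the objects on assignment unless every step used & explicitly, which is where the ugly code comes from.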

However, there’s more to the project than simply generating a symbolic representation of HTML. I must be able to operate on it and transform it, and I quickly concluded that doing so in PHP4 would be impractical. The code to simply add a tag to a document is already far too convoluted, even after multiple iterations of cleanup and refactoring. One of the major reasons I opted for a recursive, object-oriented data structure was to simplify this substitution phase. But PHP4 has made it all too hard to do.
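As a rough illustration of the substitution phase, the sketch below walks a tree and replaces placeholders in its text nodes. Everything here is an assumption made for the example: the TplNode class, the substitute() function, and the {name} placeholder syntax are all hypothetical. It relies on PHP5 object handles, so the recursive call modifies the original nodes in place.

```php
<?php
// Hypothetical template node, for illustration only.
class TplNode
{
    public $tag;                 // null marks a text node
    public $text;
    public $children = array();

    public function __construct($tag, $text = '', array $children = array())
    {
        $this->tag = $tag;
        $this->text = $text;
        $this->children = $children;
    }
}

// Replace placeholders in every text node below (and including) $node.
function substitute(TplNode $node, array $vars)
{
    if ($node->tag === null) {
        $node->text = strtr($node->text, $vars);
    }
    foreach ($node->children as $child) {
        substitute($child, $vars);  // $child is a handle to the stored object
    }
}

$tree = new TplNode('p', '', array(
    new TplNode(null, 'Hello, {name}!'),
));
substitute($tree, array('{name}' => 'world'));

$result = $tree->children[0]->text;
```

In PHP4 the foreach loop and the function call would each work on copies of the nodes, so the changes would be lost unless references were threaded through by hand.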

PHP4 compatibility is too important to just drop. Although there has been a noticeable increase in the number of PHP5 web hosts over the last year (I remember searching for one in late 2004!), there are still too many PHP4-only hosts to ignore. That means the data structure must be revised. However, revising it will not solve all the problems. If I keep an instance-style model, I will still have problems with references. If I abandon the instance-style model, it cripples users of the HTML parsing library. Tracking just one reference instead of the recursive model I use now would be easier, but that only means the code won’t get any more complicated. It won’t get any simpler, either, which is tragic when you can see how little stupid-code PHP5 needs. The final option would be to reject HTML altogether and accept only well-formed XML documents as templates.
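One way to picture the "just one reference" idea is a flat table of nodes linked by integer ids, where the document array is the only thing ever passed by reference. This is a sketch under my own assumptions, not the design the project settled on; the doc_create(), doc_add(), and doc_render() functions are hypothetical names.

```php
<?php
// Hypothetical flat layout: every node lives in one array, linked by ids.
function doc_create()
{
    return array('nodes' => array(), 'next' => 0);
}

// $doc is the single reference that has to be tracked.
function doc_add(&$doc, $parentId, $tag, $text = '')
{
    $id = $doc['next']++;
    $doc['nodes'][$id] = array(
        'tag'      => $tag,       // null marks a text node
        'text'     => $text,
        'children' => array(),    // ids of child nodes, not the nodes themselves
    );
    if ($parentId !== null) {
        $doc['nodes'][$parentId]['children'][] = $id;
    }
    return $id;
}

// Rendering only reads the document, so it can take a plain copy.
function doc_render($doc, $id)
{
    $node = $doc['nodes'][$id];
    if ($node['tag'] === null) {
        return $node['text'];
    }
    $inner = '';
    foreach ($node['children'] as $childId) {
        $inner .= doc_render($doc, $childId);
    }
    return '<' . $node['tag'] . '>' . $inner . '</' . $node['tag'] . '>';
}

$doc  = doc_create();
$body = doc_add($doc, null, 'body');
$p    = doc_add($doc, $body, 'p');
doc_add($doc, $p, null, 'Hello');

$flat = doc_render($doc, $body);
```

The trade-off is the one described above: callers juggle ids and a document array instead of node objects, so the library becomes less pleasant to use even though the reference handling shrinks to a single parameter.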

I’m leaning towards dropping PHP4 for now. As Fortitude moves forward, the requirements for HTML operations will become better defined, and PHP4 support could then be reinstated. For the moment, though, all I see is PHP4’s deficiencies becoming more apparent as I rely more and more on what PHP5 lets me take for granted.

2 Comments currently posted.

Moeh Bass says:

Hi,
I am glad I found your blog. As you state:

“It’s reasonably easy to do in PHP5 because of the improved reference system in that platform.”

Has PHP 5 changed how references are handled? If so, where could I find an article that discusses the change? I found an article discussing references in 4.3.0:
References in PHP: An In-Depth Look by Derick Rethans, PHP Architect, June 2005.

My warmest thanks to you if you can provide me with any help,
Fellow PHP programmer (migrating from Java)
– Moeh Bass

Brian W. Bosh says:

Moeh,

The biggest change I encounter between PHP4 and PHP5 is how foreach works.

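foreach ($array as $key => $val) {
    // PHP4: $key is a copy, and $val is a copy (objects included)
    // PHP5: $key is a copy; $val is still a copy for scalars and arrays,
    //       but an object $val is a handle to the same object
    // PHP5 also adds "as $key => &$val" to iterate by reference
}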
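A short sketch of the PHP5 behaviour as I understand it: a plain foreach still copies scalar and array elements, the &$val form (new in PHP5) binds to the element itself, and object elements can be modified either way because the loop variable holds a handle to the same object. The Item class is a made-up example.

```php
<?php
$numbers = array(1, 2, 3);

// By value: $val is a copy, so the array is untouched.
foreach ($numbers as $key => $val) {
    $val = $val * 2;
}
$afterByValue = $numbers;

// By reference (PHP5 and later): $val is bound to each element in turn.
foreach ($numbers as $key => &$val) {
    $val = $val * 2;
}
unset($val);  // break the lingering reference to the last element
$afterByRef = $numbers;

// Objects: no & needed in PHP5, the loop variable is a handle.
class Item
{
    public $n = 1;
}

$items = array(new Item(), new Item());
foreach ($items as $item) {
    $item->n = 5;
}
$objectValues = array($items[0]->n, $items[1]->n);
```

Under PHP4 the last loop would change copies of the objects and leave $items as it was.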

I hate reading things like “sometimes doesn’t work” because it doesn’t logically make sense, but it also seems that when returning references from functions, PHP4 sometimes doesn’t work. I just haven’t been able to determine WHY.
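One common cause, though I can't say it is the one at work here, is that returning by reference needs an & in two places: before the function name in the declaration, and in the =& assignment at the call site. Leave out the second one and you silently get a copy. The sketch below uses PHP5 syntax (PHP4 has no private/public keywords), and the Registry class is a hypothetical example.

```php
<?php
class Registry
{
    private $items = array('a' => 1);

    // The & before the name makes the function return a reference.
    public function &getRef($key)
    {
        return $this->items[$key];
    }

    public function get($key)
    {
        return $this->items[$key];
    }
}

$r = new Registry();

// Plain = : the returned reference is dereferenced and copied.
$copy = $r->getRef('a');
$copy = 99;
$afterCopy = $r->get('a');  // the stored value is unchanged

// =& : $ref is bound to the element stored inside the registry.
$ref =& $r->getRef('a');
$ref = 2;
$afterRef = $r->get('a');   // the stored value has changed
```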

You can see some of the other differences here: http://www.alternateinterior.com/2006/06/differences-between-php4-and-5s-object-model.html
