Implicit Evaluation with PHP

14 November 2006

History of the World Part XIV: Wherein PHP Assumes Babysitting Responsibilities

I’m a responsible programmer. Really, I am. I ask my peers for advice whenever the path leads somewhere I feel unsure about. I use (perhaps overuse, but that’s another column) functions responsibly. Regardless of what the user is told, I protect against user input internally. And so when I write a line of code, I expect it to do what it’s intended to do.

Unfortunately, PHP, being the fantastic language that it is, has attracted many “programmers” who don’t care about the subtleties of development. They’re the type who pass 0s or NULLs to functions when they don’t understand what is happening. They assign TRUE/FALSE flags on the branches of an if. The most able of them are the reason developers end up accepting invalid input and working around it.

One of the most common issues in any web application is SQL injection. Most desktop applications are big black boxes and aren’t especially susceptible to it. The web, however, as HTML and HTTP allow, freely accepts any input from the user, both what the application expects and what it doesn’t. HTTP is stateless; it can’t know any better. And HTML has no way to filter or suppress user input before it’s handled. And so enters one of the subtleties of web development: cleaning user input.

PHP has no mechanism for this. Perl has the convoluted idea of “taint mode.” In taint mode, any variable which comes from outside the script is considered tainted, and places where its use could be dangerous disallow tainted variables. On the surface it seems fine, but real-world issues quickly arise. Perhaps not always, but frequently, user-supplied data must be used in dangerous situations. In most languages, it can be sanitized and then used. But in Perl, it’s tainted, and any variable which touches it is tainted by association. The solution is to run it through a regex match, which can hand back the value unmodified but untainted. And from that point, it is all too easy to do that everywhere.

PHP takes a different approach. It has many features which make it “newbie-friendly”; magic quotes and safe mode are the most obvious. Magic quotes takes all user-supplied data and escapes it (so if the user puts "; drop table users into a textbox, PHP will supply it to the developer as \"; drop table users). On the surface, it seems nice. But it’s impractical. While it does help to prevent SQL injection, it and many other safety features are configurable, so escape characters can too easily double up when the application escapes data that PHP has already escaped. A program needs to manage its own data, and outside things changing that data make for difficult-to-diagnose bugs.
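The doubling-up problem can be sketched with addslashes(), which applies the same backslash escaping that magic quotes performed on incoming data. This is only an illustration: the variable names are hypothetical, and the first addslashes() call stands in for magic quotes being switched on in the configuration.

```php
<?php
// Raw text exactly as the user typed it into the form field.
$typed = '"; drop table users';

// Stand-in for magic quotes: the value is escaped before the script sees it.
$received = addslashes($typed);

// The application, unaware of that setting, escapes the value again.
$escapedTwice = addslashes($received);

echo $received, "\n";      // one backslash before the quote
echo $escapedTwice, "\n";  // three backslashes before the quote
```

A single stripslashes() on the doubled value only gets back to the once-escaped form, which is why the program has to know whether the setting is on before it can trust its own data.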

I thought I knew about all the babysitting roles PHP assumed, but I recently found another. One common type of SQL injection is dropping tables, as illustrated above. For one particular program, I needed to truncate a table and then rebuild it based on another database. Knowing all about security, which I do, I studied the query, looking for any vulnerabilities. I considered what would happen if it failed. And I wrote mysql_query("truncate table documents; select dev.documents.* into ;");. But for some reason, it did not work. As it turns out, mysql_query has its own babysitting built in: it will not allow stacked queries.

This is insane. Sure, it’s not difficult to call mysql_query multiple times, or to explode a query on ; and loop over the pieces. But the fact that you have to is ludicrous. As a developer, it’s important for code to do what you want. Man-made limitations like this just get in the way. Properly factoring and debugging applications is already work enough. I don’t need to spend time debugging a properly working SQL statement just because some developers don’t really know how to program.
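For what it’s worth, the split-and-loop workaround might look something like the sketch below. The function names split_statements and run_each are hypothetical, and the runner is passed in as a callable so the example doesn’t depend on any particular database extension. A plain explode() also splits on a semicolon inside a string literal, so this only suits queries written by hand, not anything containing user data.

```php
<?php
// Split a multi-statement SQL string on ';' and drop empty fragments.
// Naive: a ';' inside a quoted string literal would be split as well.
function split_statements($sql)
{
    $statements = array();
    foreach (explode(';', $sql) as $fragment) {
        $fragment = trim($fragment);
        if ($fragment !== '') {
            $statements[] = $fragment;
        }
    }
    return $statements;
}

// Pass each statement to $runner in order, stopping at the first one
// for which $runner returns false. Returns how many statements succeeded.
function run_each($sql, $runner)
{
    $done = 0;
    foreach (split_statements($sql) as $statement) {
        if (call_user_func($runner, $statement) === false) {
            break;
        }
        $done++;
    }
    return $done;
}
```

Where the old mysql extension is available, the call would be run_each($sql, 'mysql_query'), since mysql_query() returns FALSE when a statement fails.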
