Wednesday, November 9, 2011

An interesting find

We needed to expose a resource at a URL of the form:

http://www.example.com/widgets/<url-encoded-identifier>

In some cases, we found that the request never made it to the controller. When the identifier contained %2F (a URL-encoded '/') followed by other characters, it appeared that either Apache or Passenger URL-decoded the path before trying to figure out where to send the request. It couldn't find a match, so it returned a 404.

If we started the server using rackup instead of through Apache and Passenger, the request was routed correctly.

Now, if the identifier contained a space (URL-encoded as +), we saw another problem. We know Rails tries to help by unescaping parameters. When the param comes from the URL path, Rails uses URI::unescape. This leaves the + in, so the parameter received by the controller is incorrect. If it were only the +, it would be one thing. It turns out that URI::unescape is deprecated in Ruby 1.9.2, and it doesn't work for a few other characters either. So patching the value with gsub wasn't an option.

In Ruby 1.9.2, URI::unescape is replaced by CGI::unescape, which behaves correctly. In Ruby 1.8.7, however, CGI::unescape is not considered stable, so Rails had to decide which Ruby to support better. It seems to have taken a halfway approach: Rails uses URI::unescape when unescaping params from the URL path, but it appears to use CGI::unescape when the param comes from the query string!
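
To make the difference concrete, here is a minimal sketch of the two decoding styles side by side. The identifier and the helper names (decode_path_style, decode_form_style) are made up for illustration. URI.unescape itself was removed in Ruby 3.0, so the sketch assumes URI::RFC2396_Parser#unescape is a fair stand-in for it; it is not a copy of what Rails does internally.

```ruby
require 'cgi'
require 'uri'

# Percent-decoding only, standing in for the old URI::unescape
# (assumption: RFC2396_Parser#unescape behaves the same way).
# A "+" is left untouched.
def decode_path_style(encoded)
  URI::RFC2396_Parser.new.unescape(encoded)
end

# Form-style decoding: "+" is turned into a space, then
# percent-escapes are decoded.
def decode_form_style(encoded)
  CGI.unescape(encoded)
end

# Hypothetical identifier "red widget/v2", encoded form-style.
encoded = 'red+widget%2Fv2'

puts decode_path_style(encoded)  # red+widget/v2
puts decode_form_style(encoded)  # red widget/v2
```

The same encoded string yields two different values depending on which decoder runs, which matches the mismatch we saw between path params and query-string params.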

Given the two issues, we had to change our URL structure to something like:

http://www.example.com/widgets/<hex-identifier>?id=<url-encoded-identifier>
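
A rough sketch of how such a URL could be built, assuming the hex identifier is simply the hex encoding of the identifier's bytes. The helper names (hex_encode, hex_decode, widget_url) are hypothetical and only here for illustration.

```ruby
require 'cgi'

# Hex-encode the raw bytes of the identifier. The result contains only
# 0-9 and a-f, so there is nothing in the path for a server to unescape.
def hex_encode(identifier)
  identifier.unpack1('H*')
end

# Reverse of hex_encode. pack returns a binary string, so tag it as UTF-8.
def hex_decode(hex)
  [hex].pack('H*').force_encoding(Encoding::UTF_8)
end

# Build the URL: hex in the path, the form-encoded identifier in the
# query string.
def widget_url(identifier)
  "http://www.example.com/widgets/#{hex_encode(identifier)}?id=#{CGI.escape(identifier)}"
end

puts widget_url('red widget/v2')
# http://www.example.com/widgets/726564207769646765742f7632?id=red+widget%2Fv2
```

The path segment never contains %2F or +, so it survives whatever decoding happens before routing, and the readable identifier travels in the query string where the form-style decoding applies.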

Grrr...

Does anyone have a better solution?