One of the benefits we might get from the new architecture I'm testing is much simpler and more integrated feed management, so while adapting the current code to the new class layout I took the opportunity to revise our support for Conditional GET.
I had already adapted Simon Willison's code to work around a bug in PHP's date formats (fixed in 4.3.11), but the code was a bit too complex for my taste. Re-reading the relevant parts of the HTTP/1.1 specification, I also noticed that full compliance requires supporting three different date formats, all in GMT:
Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov  6 08:49:37 1994 ; ANSI C's asctime() format
All HTTP date/time stamps MUST be represented in Greenwich Mean Time (GMT), without exception.
HTTP/1.1 clients and servers that parse the date value MUST accept all three formats (for compatibility with HTTP/1.0), though they MUST only generate the RFC 1123 format for representing HTTP-date values in header fields.
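To make the parsing requirement concrete, here is a minimal sketch of accepting all three formats. The helper name parse_http_date is hypothetical and is not part of the function in this post. It assumes PHP 5.3 or later (for DateTime::createFromFormat) and collapses repeated whitespace first, so that asctime's space-padded day of month parses.

```php
<?php
// Hypothetical helper (not from the original code): parse any of the three
// HTTP-date formats and return a Unix timestamp, or false if none matches.
function parse_http_date($value) {
    // asctime pads single-digit days with an extra space; normalise it away
    $value = preg_replace('/\s+/', ' ', trim($value));
    $utc = new DateTimeZone('UTC');
    $formats = array(
        'D, d M Y H:i:s \G\M\T', // RFC 1123: Sun, 06 Nov 1994 08:49:37 GMT
        'l, d-M-y H:i:s \G\M\T', // RFC 850:  Sunday, 06-Nov-94 08:49:37 GMT
        'D M j H:i:s Y',         // asctime:  Sun Nov 6 08:49:37 1994
    );
    foreach ($formats as $format) {
        // '!' resets every field that the format does not set
        $date = DateTime::createFromFormat('!' . $format, $value, $utc);
        if ($date !== false) {
            return $date->getTimestamp();
        }
    }
    return false;
}

// All three spellings of the RFC's example date give the same timestamp.
var_dump(parse_http_date('Sun, 06 Nov 1994 08:49:37 GMT'));
var_dump(parse_http_date('Sunday, 06-Nov-94 08:49:37 GMT'));
var_dump(parse_http_date('Sun Nov  6 08:49:37 1994'));
```

This sketch does not reject dates that createFromFormat accepts with warnings (an out-of-range day, for example), so treat it as a starting point rather than a strict validator.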
So I rewrote the relevant part of our code from scratch and ended up with the following function. It's still untested (I know, I know, I should write unit tests first...), so please let me know about any problems, missing features, or errors.
Update
I kept wondering whether comparing the dates as strings is enough, so I finally checked RFC 2616 again and found the following in section 14.25:
When handling an If-Modified-Since header field, some servers will use an exact date comparison function, rather than a less-than function, for deciding whether to send a 304 (Not Modified) response. To get best results when sending an If-Modified-Since header field for cache validation, clients are advised to use the exact date string received in a previous Last-Modified header field whenever possible.
So comparing the dates as strings may be enough, even though a proper date comparison would be better.
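For comparison, this is roughly what a "proper" date check could look like. It is only a sketch: the helper name is_not_modified is hypothetical, and it leans on strtotime() as a lenient parser. I am sure it handles the RFC 1123 form; whether it accepts every legacy spelling a client might send is an assumption that would need testing.

```php
<?php
// Hypothetical helper (not part of the original function): returns true when
// the date sent by the client is the same as, or later than, our timestamp.
function is_not_modified($if_modified_since, $tstamp) {
    $client_time = strtotime($if_modified_since);
    if ($client_time === false) {
        // Unparsable header: play safe and send the full response
        return false;
    }
    return $client_time >= $tstamp;
}

// The resource was last changed at the RFC's example date.
$tstamp = 784111777; // Sun, 06 Nov 1994 08:49:37 GMT
var_dump(is_not_modified('Sun, 06 Nov 1994 08:49:37 GMT', $tstamp));
var_dump(is_not_modified('Mon, 07 Nov 1994 08:49:37 GMT', $tstamp));
```

Unlike the exact string match, this also answers 304 when the client sends a later date than the one we generated, which is the "less-than" behaviour the RFC passage alludes to.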
// NOTE: the function name is a placeholder, since the original signature is
// missing here. $tstamp is the Unix timestamp of the last modification.
function conditional_get($tstamp) {
    // ETag is any quoted string
    $etag = '"' . $tstamp . '"';
    // RFC1123 date, see http://bugs.php.net/bug.php?id=31842
    if (version_compare(PHP_VERSION, "4.3.11", ">="))
        $format = 'r';
    else
        $format = 'D, d M Y H:i:s O';
    // drop the trailing "+0000" and append the literal "GMT"
    $rfc1123 = substr(gmdate('r', $tstamp), 0, -5) . 'GMT';
    // RFC1036 date
    $rfc1036 = gmdate('l, d-M-y H:i:s ', $tstamp) . 'GMT';
    // asctime: day of month space-padded to two characters, year at the end
    $ctime = gmdate('D M ', $tstamp)
        . str_pad(gmdate('j', $tstamp), 2, ' ', STR_PAD_LEFT)
        . gmdate(' H:i:s Y', $tstamp);
    // Send the headers
    header("Last-Modified: $rfc1123");
    header("ETag: $etag");
    // See if the client has provided the required headers
    $if_modified_since = $if_none_match = false;
    if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']))
        $if_modified_since = stripslashes($_SERVER['HTTP_IF_MODIFIED_SINCE']);
    if (isset($_SERVER['HTTP_IF_NONE_MATCH']))
        $if_none_match = stripslashes($_SERVER['HTTP_IF_NONE_MATCH']);
    if (!$if_modified_since && !$if_none_match) {
        // both are missing
        return $rfc1123;
    }
    // At least one of the headers is there - check them
    // check etag if it's there and there's no if-modified-since
    if ($if_none_match) {
        if ($if_none_match != $etag) {
            // etag is there but doesn't match
            return $rfc1123;
        }
        if (!$if_modified_since && ($if_none_match == $etag)) {
            header('HTTP/1.0 304 Not Modified');
            exit;
        }
    }
    if ($if_modified_since) {
        // check if-modified-since against each of the three date formats
        foreach (array($rfc1123, $rfc1036, $ctime) as $d) {
            if ($d == $if_modified_since) {
                // Nothing has changed since their last request - serve a 304 and exit
                header('HTTP/1.0 304 Not Modified');
                exit;
            }
        }
    }
    // return $rfc1123 as it may be useful later, eg 'lastBuildDate' for RSS2
    return $rfc1123;
}
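Because the function above calls header() and exit directly, it is awkward to unit-test. One possible way around that, sketched here under the assumption that it is acceptable to separate the decision from its side effects, is a small pure function that takes the validators as arguments and returns the status code. The name conditional_get_status and its parameter list are hypothetical; the logic mirrors the checks above, including the exact string comparison of dates.

```php
<?php
// Hypothetical helper: same decision logic as the function above, but it
// returns 200 or 304 instead of sending headers and exiting.
// $dates is the list of acceptable date strings (RFC 1123, RFC 1036, asctime).
function conditional_get_status($etag, array $dates, $if_none_match, $if_modified_since) {
    if (!$if_modified_since && !$if_none_match) {
        return 200; // the client sent no validators
    }
    if ($if_none_match) {
        if ($if_none_match !== $etag) {
            return 200; // ETag is there but doesn't match
        }
        if (!$if_modified_since) {
            return 304; // ETag matches and there is no date to check
        }
    }
    // If-Modified-Since is present here: compare it as an exact string
    return in_array($if_modified_since, $dates, true) ? 304 : 200;
}

$etag  = '"784111777"';
$dates = array(
    'Sun, 06 Nov 1994 08:49:37 GMT',
    'Sunday, 06-Nov-94 08:49:37 GMT',
    'Sun Nov  6 08:49:37 1994',
);
var_dump(conditional_get_status($etag, $dates, $etag, false));      // matching ETag only
var_dump(conditional_get_status($etag, $dates, '"stale"', false));  // different ETag
```

The real function would then only need to read $_SERVER, call this helper, and send the 304 header when it says so.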
A typo? The variable $format is defined but not used.
Cecil: yes, it's a leftover from when the code had the RSS/Atom feeds integrated. I'll clean it up later; thanks for pointing it out.
Thanks for an extremely helpful and useful article btw.