|
Hyrule 18791 Posts
I needed to make sure dates were in Y-m-d format for some calculations at work, so I wrote this up. It's not perfect, but it's good enough for my needs. So, for anyone who uses PHP and needs to make sure dates are in Y-m-d format (for strtotime() or whatever reason), I grant you checkDateFormat(). It checks for validly formatted dates with space, hyphen, period, or slash separation.
function checkDateFormat($date, $empty = '0', $sep = '/') {
  $date = preg_replace('/[^\d\-\. \/]/', '', $date); // remove non-numeric and non-date separators
  $check = preg_split('/[ \-\.\/]/', $date);
  if(empty($check)) return '0';
  foreach($check as &$item)
    $item = str_pad($item, 2, '0', STR_PAD_LEFT);
  unset($item); // drop the reference left over from the loop
  $date = implode('-', $check);

  // matches for all format dates, with space, hyphen, period, or slash separators
  // correct format needed is Y-m-d
  // 100 is used as the final check to keep room open for later additions
  $patterns = array(
    0 => '/(9999)[- \.\/](09)[- \.\/](09)/', // one of the many different "blank" values used
    1 => '/(0[1-9]|1[012])[- \.\/](0[1-9]|1[012])[- \.\/](\d{4,4})/', // unknown d-m order, Y at end
    2 => '/(\d{4,4})[- \.\/](0[1-9]|1[012])[- \.\/](0[1-9]|1[012])/', // unknown d-m order, Y at front
    3 => '/(0[1-9]|1[012])[- \.\/](0[1-9]|[12][0-9]|3[01])[- \.\/](\d{4,4})/', // m-d-Y
    4 => '/(\d{4,4})[- \.\/](0[1-9]|1[012])[- \.\/](0[1-9]|[12][0-9]|3[01])/', // Y-m-d (aka already correct)
    5 => '/(0[1-9]|[12][0-9]|3[01])[- \.\/](0[1-9]|1[012])[- \.\/](\d{4,4})/', // d-m-Y
    6 => '/(\d{4,4})[- \.\/](0[1-9]|[12][0-9]|3[01])[- \.\/](0[1-9]|1[012])/', // Y-d-m
    99 => '/ |[s(\302\240|\240)]+|[W]+/', // blank, empty, or placeholder
    100 => '/(\d{2,2})[- \.\/](\d{2,2})[- \.\/](\d{2,2})/' // unknown all double digits
  );

  // for unknown d-m order assume month is first
  $replace = array(
    0 => $empty,
    1 => "$3{$sep}$1{$sep}$2", // unknown d-m order, Y at end -> Y-m-d
    2 => "$1{$sep}$2{$sep}$3", // unknown d-m order, Y at front -> Y-m-d
    3 => "$3{$sep}$1{$sep}$2", // m-d-Y -> Y-m-d
    4 => "$1{$sep}$2{$sep}$3", // Y-m-d -> Y-m-d
    5 => "$3{$sep}$2{$sep}$1", // d-m-Y -> Y-m-d
    6 => "$1{$sep}$3{$sep}$2", // Y-d-m -> Y-m-d ($1 = Y, $2 = d, $3 = m)
    99 => "$0",
    100 => $empty
  );

  // the first pattern that matches wins
  foreach($patterns as $index => $pattern)
    if(preg_match($pattern, $date))
      return preg_replace($pattern, $replace[$index], $date);

  return $empty;
}
Yeah, the regular expressions are long, but whatever. I replaced tabs with 2 spaces because it was huge (I develop with tabstop = 4, but the most common setting is 8, which is ugly).
If you want to return the original date, change $replace as follows:
  100 => "$1-$2-$3"
Easy, eh?
I'll explain it more thoroughly later (ie: when I'm not at work) if anyone wants me to.
[update] Added in checks for single digits (again, my users cannot be trusted!) and some other checks
|
Yea, that looks right......
|
More like "Regex - Date Format Check"
|
|
Hyrule 18791 Posts
On July 21 2010 01:46 Dance.jhu wrote: Yea, that looks right......
^^
On July 21 2010 01:50 Cambium wrote: More like "Regex - Date Format Check"
Nah. Regex is a way of expressing patterns; you still need Perl/PHP/somelanguagethathandlesregex to use them. But yeah, it's more regex than PHP :X
On July 21 2010 02:00 gen.Sun wrote: stackoverflow.com
Is that a nice way of saying "gtfo"?
|
What is the context of this solution - where's the data coming from? If you can't assume users are entering dates in just one format, then you can't assume they're not going to put the day before the month in ambiguous dates. The clear problem here is the lack of disambiguation between m-d and d-m dates.
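To make the ambiguity concrete, here is a minimal sketch that flags dates which read validly as either m-d-Y or d-m-Y. The helper name isAmbiguousDayMonth() is made up for illustration, and the sketch assumes the year comes last and PHP 7 or newer for the type hints.

```php
<?php
/**
 * Hypothetical helper: reports whether a numeric date with the year last
 * could be read as either m-d-Y or d-m-Y.
 */
function isAmbiguousDayMonth(string $date): bool
{
    // Two 1-2 digit fields and a 4-digit year, separated by space, hyphen, period, or slash.
    if (!preg_match('/^(\d{1,2})[- .\/](\d{1,2})[- .\/](\d{4})$/', trim($date), $m)) {
        return false; // not in this shape, so this check has nothing to say
    }
    $first = (int) $m[1];
    $second = (int) $m[2];

    // Both fields must be plausible months, and they must differ:
    // 05/05/2010 is the same day under either reading.
    return $first >= 1 && $first <= 12
        && $second >= 1 && $second <= 12
        && $first !== $second;
}

var_dump(isAmbiguousDayMonth('03/04/2010')); // bool(true)
var_dump(isAmbiguousDayMonth('07/21/2010')); // bool(false)
```

Anything flagged this way can't be resolved by pattern matching alone; it needs outside knowledge, such as where the record came from.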
|
Hyrule 18791 Posts
True, but the data is coming from America (not really what you asked, but eh?), so dates are typically m-d-Y.
As for who's entering it, right now there are only a few people in our office, but this project will be sold to others. My code (elsewhere) checks the whole ambiguous date thing in other ways.
|
There's not really a need for such complicated logic. strtotime() accepts any English date format, so you could use something like:
  $time = strtotime($date);
  // strtotime() returns false on failure (PHP 5.1+); negative timestamps are valid pre-1970 dates
  if ($time === false) throw new Exception("Invalid Date Format: $date");
  return date('Y-m-d', $time);
If you're running PHP 5.2 or newer, you can also use the DateTime class.
  return new DateTime($date); // throws an exception if your date is invalid
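If guessing the format is the worry, a stricter variant is to parse against one explicit format instead. The following is only a sketch: parseStrict() is a hypothetical name, DateTime::createFromFormat() needs PHP 5.3 or newer, and the nullable return type assumes PHP 7.1 or newer.

```php
<?php
/**
 * Hypothetical helper: parse $raw against one explicit format and return
 * Y-m-d, or null if the input does not match that format exactly.
 */
function parseStrict(string $raw, string $format = 'm/d/Y'): ?string
{
    // The leading "!" resets unparsed fields (time of day) instead of using "now".
    $dt = DateTime::createFromFormat('!' . $format, $raw);

    // createFromFormat() rolls overflow forward (02/30 becomes 03/02),
    // so compare the round trip with the input to reject such values.
    if ($dt === false || $dt->format($format) !== $raw) {
        return null;
    }
    return $dt->format('Y-m-d');
}

var_dump(parseStrict('07/21/2010'));          // string(10) "2010-07-21"
var_dump(parseStrict('02/30/2010'));          // NULL
var_dump(parseStrict('21.07.2010', 'd.m.Y')); // string(10) "2010-07-21"
```

The round-trip comparison is deliberately strict: with the default format, an unpadded value like 7/21/2010 is rejected too, so inputs would need padding first or a format such as 'n/j/Y'.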
|
I don't see why checkdate() wouldn't work for this if you expect the arguments in a certain order. This seems like more of an input problem than a parsing problem.
|
The dates are presumably in text... only reason you would do this
|
On July 21 2010 02:38 tofucake wrote: Is that a nice way of saying "gtfo"?
It's just a better place to ask programming questions, it'll be both faster and better.
|
aers
United States 1210 Posts
He's not asking a question, though.
|
Hyrule 18791 Posts
On July 21 2010 04:58 Pryce wrote: There's not really a need for such complicated logic. strtotime() accepts any English date format, so you could use something like:
  $time = strtotime($date);
  if ($time === false) throw new Exception("Invalid Date Format: $date");
  return date('Y-m-d', $time);
If you're running PHP 5.2 or newer, you can also use the DateTime class.
  return new DateTime($date); // throws an exception if your date is invalid
strtotime() is used, but it doesn't accept any format. This is used for financial transactions (well, displaying them... thousands of them), so a bunch of "Invalid Date Format: $date" displays is not acceptable.
On July 21 2010 05:04 R1CH wrote: I don't see why checkdate() wouldn't work for this if you expect the arguments in a certain order. This seems like more of an input problem than a parsing problem.
The people inputting the dates cannot be trusted.
On July 21 2010 10:14 gen.Sun wrote:
On July 21 2010 02:38 tofucake wrote:
On July 21 2010 02:00 gen.Sun wrote: stackoverflow.com
Is that a nice way of saying "gtfo"?
It's just a better place to ask programming questions, it'll be both faster and better.
I'm not asking a question, I'm posting a solution to a possible question. Also, I literally just forgot what I was going to say.
|
On July 21 2010 21:02 tofucake wrote:
On July 21 2010 04:58 Pryce wrote: There's not really a need for such complicated logic. strtotime() accepts any English date format, so you could use something like:
  $time = strtotime($date);
  if ($time === false) throw new Exception("Invalid Date Format: $date");
  return date('Y-m-d', $time);
If you're running PHP 5.2 or newer, you can also use the DateTime class.
  return new DateTime($date); // throws an exception if your date is invalid
strtotime() is used, but it doesn't accept any format. This is used for financial transactions (well, displaying them... thousands of them), so a bunch of "Invalid Date Format: $date" displays is not acceptable.
See, I just assumed that you were working with large blocks of text, because otherwise regular expressions are one of the worst ways to solve this problem. strtotime() will work with any of the date formats you test for, and will make the same assumption your code does for ambiguous month/day formats. But whatever floats your boat.
|
Hyrule 18791 Posts
Yeah, I was using just strtotime() before, and it was returning 0 for about half the dates. Since I started using my version, all the dates are formatted correctly (and all the calculations are correct).
|
Hyrule 18791 Posts
Shameless bump! I updated my original with a few more checks. Works better since I found some more formats in the database :|
|
konadora
Singapore 66060 Posts
|
Hyrule 18791 Posts
It's really not. Most of that is actually Perl (which is very difficult to read if you're not used to it). The only reason I do something this complicated is because I didn't write the original stuff, and that allowed for (and does) a lot of dumb things.
|
Is this for users to enter dates into a text box that you then check? If so, why go through this when you can just keep users away from text boxes and give them drop-down boxes or a calendar of some sort? That way you never need to worry about bad user input. It's usually better to just not give users the freedom to do anything they want, because they can and WILL break it.
|
Hyrule 18791 Posts
Because there are already thousands upon thousands of dates in the database. I didn't write the original site, I've come in to fix it.
|
The database doesn't follow a standard date format? That just sounds horrible...
|
Hyrule 18791 Posts
Yeah. "Date" fields are all varchar(12)'s.
|
On July 24 2010 03:19 tofucake wrote: Because there are already thousands upon thousands of dates in the database. I didn't write the original site, I've come in to fix it.
You should fix the dates in the database, and change the schema to use the SQL date type for columns that are meant to store dates. The database will then constrain your inputs and outputs to be sane values (which may require some input validation logic). After that's in place, you can throw the date fixing script away, as that's the type of code that shouldn't live in production code. You'll have a hell of a time unit testing it.
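A rough sketch of what the one-off cleanup pass could look like before the schema change: split the legacy values into those that convert cleanly and those that need a human. Everything here is illustrative; triageDates(), the row-id-to-string input shape, and the stand-in normalizer are assumptions, and the database reads and writes are left out entirely.

```php
<?php
/**
 * Hypothetical one-off cleanup pass: separate legacy values that convert
 * cleanly from those that need manual review.
 *
 * @param array    $rows      row id => raw varchar value (assumed shape)
 * @param callable $normalize returns a Y-m-d string, or null on failure
 */
function triageDates(array $rows, callable $normalize): array
{
    $converted = [];
    $rejected = [];
    foreach ($rows as $id => $raw) {
        $clean = $normalize($raw);
        if ($clean === null) {
            $rejected[$id] = $raw;    // keep the original for manual review
        } else {
            $converted[$id] = $clean; // candidate value for a DATE column
        }
    }
    return ['converted' => $converted, 'rejected' => $rejected];
}

// Stand-in normalizer for the example: accepts only values already shaped like Y-m-d.
$normalize = function (string $raw): ?string {
    return preg_match('/^\d{4}-\d{2}-\d{2}$/', $raw) ? $raw : null;
};

$result = triageDates([1 => '2010-07-21', 2 => '07/21/10', 3 => ''], $normalize);
print_r($result);
```

Collecting the rejects instead of coercing them to a placeholder keeps the bad rows visible, which matters when the values feed financial calculations.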
|
I just add a MM/DD/YYYY thing next to every date entry box and write one regex to check if it's valid, just seems to make more sense.
Oh, it's already in the database LOL, that sucks. Though I'm sure there's already a date parser somewhere.
Also, "// for unknown d-m order assume month is first" is really sketchy >.>
Also also, it's generally better to just parse the date with a regex and then use a separate step to check whether the date is valid (e.g. not 99-99-9999), cause using so many alternations isn't very efficient and makes the regex engine backtrack quite a bit.
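A minimal sketch of that split, assuming m-d-Y input: one simple regex only checks the shape, and PHP's built-in checkdate() decides whether the calendar date exists. normalizeMdy() is a hypothetical name, and the array destructuring and nullable return type assume PHP 7.1 or newer.

```php
<?php
/**
 * Hypothetical helper: split an m-d-Y string with one simple regex,
 * then let checkdate() decide whether the calendar date exists.
 */
function normalizeMdy(string $raw): ?string
{
    // Shape only: 1-2 digits, 1-2 digits, 4 digits, separated by space, hyphen, period, or slash.
    if (!preg_match('/^\s*(\d{1,2})[- .\/](\d{1,2})[- .\/](\d{4})\s*$/', $raw, $m)) {
        return null;
    }
    [$month, $day, $year] = [(int) $m[1], (int) $m[2], (int) $m[3]];

    // checkdate(month, day, year) knows month lengths and leap years.
    if (!checkdate($month, $day, $year)) {
        return null;
    }
    return sprintf('%04d-%02d-%02d', $year, $month, $day);
}

var_dump(normalizeMdy('7/4/2010'));   // string(10) "2010-07-04"
var_dump(normalizeMdy('99-99-9999')); // NULL
var_dump(normalizeMdy('02.29.2012')); // string(10) "2012-02-29"
```

The regex has no alternations to backtrack over, and the range logic lives in checkdate() rather than in the pattern.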
|
On July 24 2010 14:03 Pryce wrote:
On July 24 2010 03:19 tofucake wrote: Because there are already thousands upon thousands of dates in the database. I didn't write the original site, I've come in to fix it.
You should fix the dates in the database, and change the schema to use the SQL date type for columns that are meant to store dates. The database will then constrain your inputs and outputs to be sane values (which may require some input validation logic). After that's in place, you can throw the date fixing script away, as that's the type of code that shouldn't live in production code. You'll have a hell of a time unit testing it.
Agreed completely, especially the production code point; this piece of code will be a nightmare to maintain down the road. I'd put the check at the PHP level or even the JS level instead of relying on the db to throw exceptions.
The whole thing just seems completely unnecessary. After all, user stupidity is unbounded, and you can't check for everything.
|
Hyrule 18791 Posts
I have the date check function because my current assignment is to fix a particular page. Once that's done I'll have a new assignment, and I'm going to be pushing for that to be to fix some things in the database (like dates! but there are bigger problems), and then move the check from being used tens of thousands of times in processing to once to validate input before it's stored.
|