Discussion:
strptime without second argument as an inverse to __str__
Ram Rachum
2014-08-04 16:17:14 UTC
What do you think about having `datetime.strptime`, when called without a
`format` for the second argument, be a precise inverse of
`datetime.__str__`? This is because I don't currently see an obvious way to
get an inverse of `datetime.__str__`, and this seems like an okay place to
put it.
Steven D'Aprano
2014-08-04 18:15:28 UTC
On Mon, Aug 04, 2014 at 09:17:14AM -0700, Ram Rachum wrote:
> What do you think about having `datetime.strptime`, when called without a
> `format` for the second argument, be a precise inverse of
> `datetime.__str__`? This is because I don't currently see an obvious way to
> get an inverse of `datetime.__str__`, and this seems like an okay place to
> put it.

Is str(datetime) guaranteed to use a specific format, or is that an
implementation detail?


--
Steven
Alexander Belopolsky
2014-08-04 18:40:57 UTC
On Mon, Aug 4, 2014 at 2:15 PM, Steven D'Aprano wrote:

> > What do you think about having `datetime.strptime`, when called without a
> > `format` for the second argument, be a precise inverse of
> > `datetime.__str__`? This is because I don't currently see an obvious way
> to
> > get an inverse of `datetime.__str__`, and this seems like an okay place
> to
> > put it.
>
> Is str(datetime) guaranteed to use a specific format, or is that an
> implementation detail?


Why is this question relevant for Ram's proposal? As long as str(datetime)
is guaranteed to be different for different datetimes, one should be able
to implement an inverse. The inverse function should accept ISO format
(with either ' ' or 'T' separator) and str(datetime) if it is different in
the implementation.

I agree that the datetime type should provide a simple way to construct
instances from well-formatted strings, but I don't think
datetime.strptime() is a good choice of name. I would much rather have
date(str), time(str) and datetime(str) constructors.
Skip Montanaro
2014-08-04 19:00:17 UTC
On Mon, Aug 4, 2014 at 1:40 PM, Alexander Belopolsky wrote:
> Why is this question relevant for Ram's proposal?

It would seem to have some impact on how hard it is to create a
general inverse. Will one format work for all platforms ("one and
done"), or will the inverse implementation potentially have to be
updated as new platforms come into (or go out of) existence?

Also, would the creation of such an inverse lock the implementation
into existing format(s)? For example, when fed a datetime object, the
CSV module will stringify it for output. If I create a CSV file with
one version of Python, then read it into another version of Python (or
on a different platform), it's not unreasonable that I would expect
one-argument strptime() to parse it. That would lock you into a
specific format. If only one format exists today, no big deal, bless
it and move on.

Skip
Alexander Belopolsky
2014-08-04 19:23:08 UTC
On Mon, Aug 4, 2014 at 3:00 PM, Skip Montanaro wrote:

> > Why is this question relevant for Ram's proposal?
>
> It would seem to have some impact on how hard it is to create a
> general inverse. Will one format work for all platforms ("one and
> done"), or will the inverse implementation potentially have to be
> updated as new platforms come into (or go out of) existence?


I think the str(datetime) format is an implementation detail to the same extent
as str(int) or str(float) is. In the past, such variations did not
prevent us from providing a (sometimes imperfect) inverse.
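
To make the analogy concrete, here is a minimal sketch of the existing
round trips for int and float. The sample values are arbitrary, and the
float case assumes a modern Python 3, where str() of a float is the
shortest string that converts back to the same value.

```python
# Round-tripping through str() for the built-in numeric types.
n = 12345678901234567890
x = 0.1 + 0.2

assert int(str(n)) == n      # int() inverts str() for integers
assert float(str(x)) == x    # float() inverts str() for finite floats

# The inverse is "imperfect" in that it accepts more than str() produces.
assert int(" 42\n") == 42
assert float("1e3") == 1000.0
```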
Skip Montanaro
2014-08-04 20:14:11 UTC
On Mon, Aug 4, 2014 at 2:23 PM, Alexander Belopolsky wrote:
>
> On Mon, Aug 4, 2014 at 3:00 PM, Skip Montanaro wrote:
>>
>> > Why is this question relevant for Ram's proposal?
>>
>> It would seem to have some impact on how hard it is to create a
>> general inverse. Will one format work for all platforms ("one and
>> done"), or will the inverse implementation potentially have to be
>> updated as new platforms come into (or go out of) existence?
>
>
> I think str(datetime) format is an implementation detail to the same extent
> as str(int) or str(float) is. In the past, these variations did not prevent
> providing (sometimes imperfect) inverse.

I took a look at whatever version of CPython I have lying about (some
variant of 2.7). str(datetime) seems to be well-defined as calling
isoformat with " " as the separator. The only caveat is that if the
microsecond field is zero, it's omitted.

If that behavior holds true in 3.x, only two cases require consideration:

%Y-%m-%d %H:%M:%S
%Y-%m-%d %H:%M:%S.%f
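
A minimal sketch of what handling those two cases might look like,
assuming naive datetimes only. The name parse_naive_str is made up for
illustration and is not an existing API; it simply tries each format in
turn with the existing two-argument strptime.

```python
from datetime import datetime

# The two shapes str() produces for a naive datetime.
NAIVE_FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S")


def parse_naive_str(text):
    """Hypothetical helper: invert str() for a naive datetime."""
    for fmt in NAIVE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next candidate format
    raise ValueError("not in str(datetime) format: %r" % (text,))


# Round trip with and without a non-zero microsecond field.
for original in (datetime(2014, 8, 4, 16, 17, 14),
                 datetime(2014, 8, 4, 16, 17, 14, 250)):
    assert parse_naive_str(str(original)) == original
```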

Skip
Wolfgang Maier
2014-08-04 20:56:56 UTC
On 04.08.2014 22:14, Skip Montanaro wrote:
> On Mon, Aug 4, 2014 at 2:23 PM, Alexander Belopolsky wrote:
>>
>> On Mon, Aug 4, 2014 at 3:00 PM, Skip Montanaro wrote:
>>>
>>>> Why is this question relevant for Ram's proposal?
>>>
>>> It would seem to have some impact on how hard it is to create a
>>> general inverse. Will one format work for all platforms ("one and
>>> done"), or will the inverse implementation potentially have to be
>>> updated as new platforms come into (or go out of) existence?
>>
>>
>> I think str(datetime) format is an implementation detail to the same extent
>> as str(int) or str(float) is. In the past, these variations did not prevent
>> providing (sometimes imperfect) inverse.
>
> I took a look at whatever version of CPython I have lying about (some
> variant of 2.7). str(datetime) seems to be well-defined as calling
> isoformat with " " as the separator. The only caveat is that if the
> microsecond field is zero, it's omitted.
>
> If that behavior holds true in 3.x, only two cases require consideration:
>

It does hold true in 3.x, but the documented behavior is slightly more
complex (I assume also in 2.x):

datetime.__str__()
For a datetime instance d, str(d) is equivalent to d.isoformat(' ').

datetime.isoformat(sep='T')

Return a string representing the date and time in ISO 8601 format,
YYYY-MM-DDTHH:MM:SS.mmmmmm or, if microsecond is 0, YYYY-MM-DDTHH:MM:SS

If utcoffset() does not return None, a 6-character string is
appended, giving the UTC offset in (signed) hours and minutes:
YYYY-MM-DDTHH:MM:SS.mmmmmm+HH:MM or, if microsecond is 0
YYYY-MM-DDTHH:MM:SS+HH:MM

The optional argument sep (default 'T') is a one-character
separator, placed between the date and time portions of the result.

> %Y-%m-%d %H:%M:%S
> %Y-%m-%d %H:%M:%S.%f
>

=> plus timezone versions of the above.
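
For illustration, a short sketch that prints all four shapes. It assumes
the datetime.timezone class from 3.x; the +02:00 offset and the sample
values are arbitrary choices.

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=2))  # arbitrary fixed offset for the demo
samples = [
    datetime(2014, 8, 4, 16, 17, 14),
    datetime(2014, 8, 4, 16, 17, 14, 250),
    datetime(2014, 8, 4, 16, 17, 14, tzinfo=tz),
    datetime(2014, 8, 4, 16, 17, 14, 250, tzinfo=tz),
]

shapes = [str(d) for d in samples]
for d, s in zip(samples, shapes):
    assert s == d.isoformat(" ")  # the documented equivalence
    print(s)

# Prints:
# 2014-08-04 16:17:14
# 2014-08-04 16:17:14.000250
# 2014-08-04 16:17:14+02:00
# 2014-08-04 16:17:14.000250+02:00
```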

Wolfgang
Steven D'Aprano
2014-08-05 01:39:44 UTC
On Mon, Aug 04, 2014 at 10:56:56PM +0200, Wolfgang Maier wrote:
[...]
> it does hold true in 3.x, but the documented behavior is slightly more
> complex (I assume also in 2.x):
>
> datetime.__str__()
> For a datetime instance d, str(d) is equivalent to d.isoformat(' ').

Since str(d) is documented to use a well-defined format, then I agree
that it makes sense to make the second argument to d.strptime optional,
and default to that same format. The concern I had was the sort of
scenario Skip suggested: I might write out a datetime object as a string
on one machine, where the format is X, and read it back elsewhere, where
the format is Y, leading to at best an exception and at worst incorrect
data.

+1 on the suggestion.


--
Steven
Wolfgang Maier
2014-08-05 21:22:11 UTC
On 05.08.2014 03:39, Steven D'Aprano wrote:
>
> Since str(d) is documented to use a well-defined format, then I agree
> that it makes sense to make the second argument to d.strptime optional,
> and default to that same format. The concern I had was the sort of
> scenario Skip suggested: I might write out a datetime object as a string
> on one machine, where the format is X, and read it back elsewhere, where
> the format is Y, leading to at best an exception and at worst incorrect
> data.
>
> +1 on the suggestion.
>

After looking a bit into the code of the datetime module, I am not
convinced anymore that strptime() is the right place for the
functionality for the following reasons:

1) strptime already has a clear counterpart and that's strftime.

2) strftime/strptime use explicit format strings, not any more
sophisticated parsing (as would be required to parse the different
formats that datetime.__str__ can produce) and they try, intentionally,
to mimic the behavior of their C equivalents.

In other words, strftime/strptime have a very clear underlying concept,
which IMO should not be given up just because we are trying to stuff
some extra functionality into them.

That said, I still think that the basic idea - being able to
reverse-parse the output of datetime.__str__ - is right.

I would suggest that a better place for this is an additional
classmethod constructor (the datetime class already has quite a number
of them). Maybe fromisostring() could be a suitable name?
With this you could even pass an extra argument for the date-time
separator, just like with the current isoformat().
This constructor would then be more like a counterpart to
datetime.isoformat(), but it could simply be documented that calling it
as fromisostring(datestring, sep=" ") can be used to parse strings
written with datetime.__str__().

-1 on the specifics of the proposal,
+1 on the general idea.
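
A rough sketch of the idea, under stated assumptions: fromisostring is
only the name floated above, written here as a plain function rather
than a classmethod, and it handles naive datetimes only (no UTC offset).
It relies on the fact that isoformat() output for a naive datetime is
exactly 19 characters unless a fractional part follows.

```python
from datetime import datetime


def fromisostring(text, sep="T"):
    """Hypothetical constructor: parse what datetime.isoformat(sep) emits.

    Only naive datetimes are handled; UTC offsets are left out on purpose.
    """
    if len(sep) != 1:
        raise ValueError("sep must be a single character")
    # Escape a literal '%' so it is not read as a format directive.
    base = "%Y-%m-%d" + sep.replace("%", "%%") + "%H:%M:%S"
    # 'YYYY-MM-DD' + sep + 'HH:MM:SS' is 19 characters; anything longer
    # is expected to carry a '.ffffff' fractional part.
    fmt = base + ".%f" if len(text) > 19 else base
    return datetime.strptime(text, fmt)


d = datetime(2014, 8, 5, 21, 22, 11)
assert fromisostring(str(d), sep=" ") == d   # inverse of str()
assert fromisostring(d.isoformat()) == d     # inverse of isoformat()
```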
Jonas Wielicki
2014-08-05 21:28:22 UTC
On 05.08.2014 23:22, Wolfgang Maier wrote:
> On 05.08.2014 03:39, Steven D'Aprano wrote:
>>
>> Since str(d) is documented to use a well-defined format, then I agree
>> that it makes sense to make the second argument to d.strptime optional,
>> and default to that same format. The concern I had was the sort of
>> scenario Skip suggested: I might write out a datetime object as a string
>> on one machine, where the format is X, and read it back elsewhere, where
>> the format is Y, leading to at best an exception and at worst incorrect
>> data.
>>
>> +1 on the suggestion.
>>
>
> After looking a bit into the code of the datetime module, I am not
> convinced anymore that strptime() is the right place for the
> functionality for the following reasons:
>
> 1) strptime already has a clear counterpart and that's strftime.
>
> 2) strftime/strptime use explicit format strings, not any more
> sophisticated parsing (as would be required to parse the different
> formats that datetime.__str__ can produce) and they try, intentionally,
> to mimic the behavior of their C equivalents.
>
> In other words, strftime/strptime have a very clear underlying concept,
> which IMO should not be given up just because we are trying to stuff
> some extra-functionality into them.
>
> That said, I still think that the basic idea - being able to
> reverse-parse the output of datetime.__str__ - is right.
>
> I would suggest that a better place for this is an additional
> classmethod constructor (the datetime class already has quite a number
> of them). Maybe fromisostring() could be a suitable name ?

Maybe rather fromisoformat(), to stay analogous with the formatting method?

> With this you could even pass an extra-argument for the date-time
> separator just like with the current isoformat.
> This constructor would then be more like a counterpart to
> datetime.isoformat(), but it could simply be documented that calling it
> with fromisostring(datestring, sep=" ") can be used to parse strings
> written with datetime.str().
>
> -1 on the specifics of the proposal,
> +1 on the general idea.

+1 for this rating.

>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas-+ZN9ApsXKcEdnm+***@public.gmane.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
Andrew Barnert
2014-08-05 21:35:06 UTC
On Aug 5, 2014, at 14:22, Wolfgang Maier wrote:

> On 05.08.2014 03:39, Steven D'Aprano wrote:
>>
>> Since str(d) is documented to use a well-defined format, then I agree
>> that it makes sense to make the second argument to d.strptime optional,
>> and default to that same format. The concern I had was the sort of
>> scenario Skip suggested: I might write out a datetime object as a string
>> on one machine, where the format is X, and read it back elsewhere, where
>> the format is Y, leading to at best an exception and at worst incorrect
>> data.
>>
>> +1 on the suggestion.
>
> After looking a bit into the code of the datetime module, I am not convinced anymore that strptime() is the right place for the functionality for the following reasons:
>
> 1) strptime already has a clear counterpart and that's strftime.
>
> 2) strftime/strptime use explicit format strings, not any more sophisticated parsing (as would be required to parse the different formats that datetime.__str__ can produce) and they try, intentionally, to mimic the behavior of their C equivalents.
>
> In other words, strftime/strptime have a very clear underlying concept, which IMO should not be given up just because we are trying to stuff some extra-functionality into them.

What if strftime _also_ allowed the format string to be omitted, in which case it would produce the same format as str? Then they would remain perfect inverses.

> That said, I still think that the basic idea - being able to reverse-parse the output of datetime.__str__ - is right.
>
> I would suggest that a better place for this is an additional classmethod constructor (the datetime class already has quite a number of them). Maybe fromisostring() could be a suitable name ?
> With this you could even pass an extra-argument for the date-time separator just like with the current isoformat.
> This constructor would then be more like a counterpart to datetime.isoformat(), but it could simply be documented that calling it with fromisostring(datestring, sep=" ") can be used to parse strings written with datetime.str().

Wouldn't you expect a method called fromisostring to be able to parse any valid ISO string, especially given that there are third-party libs with functions named fromisoformat that do exactly that, and people suggest adding one of them to the stdlib every few months?

What you want to get across is that this function parses the default Python representation of datetimes; the fact that it happens to be a subset of ISO format doesn't seem as relevant here. I like the idea of a new alternate constructor, I'm just not crazy about the name.

>
> -1 on the specifics of the proposal,
> +1 on the general idea.
Petr Viktorin
2014-08-05 21:46:10 UTC
On Tue, Aug 5, 2014 at 11:35 PM, Andrew Barnert wrote:
> On Aug 5, 2014, at 14:22, Wolfgang Maier wrote:
>
>> On 05.08.2014 03:39, Steven D'Aprano wrote:
>>>
>>> Since str(d) is documented to use a well-defined format, then I agree
>>> that it makes sense to make the second argument to d.strptime optional,
>>> and default to that same format. The concern I had was the sort of
>>> scenario Skip suggested: I might write out a datetime object as a string
>>> on one machine, where the format is X, and read it back elsewhere, where
>>> the format is Y, leading to at best an exception and at worst incorrect
>>> data.
>>>
>>> +1 on the suggestion.
>>
>> After looking a bit into the code of the datetime module, I am not convinced anymore that strptime() is the right place for the functionality for the following reasons:
>>
>> 1) strptime already has a clear counterpart and that's strftime.
>>
>> 2) strftime/strptime use explicit format strings, not any more sophisticated parsing (as would be required to parse the different formats that datetime.__str__ can produce) and they try, intentionally, to mimic the behavior of their C equivalents.
>>
>> In other words, strftime/strptime have a very clear underlying concept, which IMO should not be given up just because we are trying to stuff some extra-functionality into them.
>
> What if strftime _also_ allowed the format string to be omitted, in which case it would produce the same format as str? Then they would remain perfect inverses.

+1

>
>> That said, I still think that the basic idea - being able to reverse-parse the output of datetime.__str__ - is right.
>>
>> I would suggest that a better place for this is an additional classmethod constructor (the datetime class already has quite a number of them). Maybe fromisostring() could be a suitable name ?
>> With this you could even pass an extra-argument for the date-time separator just like with the current isoformat.
>> This constructor would then be more like a counterpart to datetime.isoformat(), but it could simply be documented that calling it with fromisostring(datestring, sep=" ") can be used to parse strings written with datetime.str().
>
> Wouldn't you expect a method called fromisostring to be able to parse any valid ISO string, especially given that there are third-party libs with functions named fromisoformat that do exactly that, and people suggest adding one of them to the stdlib every few months?
>
> What you want to get across is that this function parses the default Python representation of datetimes; the fact that it happens to be a subset of ISO format doesn't seem as relevant here. I like the idea of a new alternate constructor, I'm just not crazy about the name.

Let me just note this, since it hasn't been said here yet:

When people say "iso" in the context of datetimes, they usually mean RFC 3339.

As Wikipedia can tell you, ISO 8601 is a big complicated non-public
specification under which today can be written as:
- 2014-08-05
- 2014-W32-2
- 2014-217
... and by now I can see why there's no ISO 8601 parser in the stdlib.
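
As a quick check, the three spellings above can be reproduced with the
existing date API. This is only a sketch of the calendar, week and
ordinal forms; it does not attempt to cover the rest of ISO 8601.

```python
from datetime import date

today = date(2014, 8, 5)
iso_year, iso_week, iso_weekday = today.isocalendar()

calendar_date = today.isoformat()                        # year-month-day
week_date = "%04d-W%02d-%d" % (iso_year, iso_week, iso_weekday)
ordinal_date = today.strftime("%Y-%j")                   # year-day of year

print(calendar_date)  # 2014-08-05
print(week_date)      # 2014-W32-2
print(ordinal_date)   # 2014-217
```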

RFC 3339, on the other hand, specifies one specific variant of ISO
8601: the one we're all used to, and which datetime's isoformat and
__str__ return. (Just about the only exception is that to be
compatible with ISO 8601, it still specifies "T"/"t" for the separator
and graciously lets people agree on space.)
Terry Reedy
2014-08-05 23:12:47 UTC
On 8/5/2014 5:35 PM, Andrew Barnert wrote:
> On Aug 5, 2014, at 14:22, Wolfgang Maier wrote:
>
>> On 05.08.2014 03:39, Steven D'Aprano wrote:
>>>
>>> Since str(d) is documented to use a well-defined format, then I agree
>>> that it makes sense to make the second argument to d.strptime optional,
>>> and default to that same format. The concern I had was the sort of
>>> scenario Skip suggested: I might write out a datetime object as a string
>>> on one machine, where the format is X, and read it back elsewhere, where
>>> the format is Y, leading to at best an exception and at worst incorrect
>>> data.
>>>
>>> +1 on the suggestion.
>>
>> After looking a bit into the code of the datetime module, I am not convinced anymore that strptime() is the right place for the functionality for the following reasons:
>>
>> 1) strptime already has a clear counterpart and that's strftime.
>>
>> 2) strftime/strptime use explicit format strings, not any more sophisticated parsing (as would be required to parse the different formats that datetime.__str__ can produce) and they try, intentionally, to mimic the behavior of their C equivalents.
>>
>> In other words, strftime/strptime have a very clear underlying concept, which IMO should not be given up just because we are trying to stuff some extra-functionality into them.
>
> What if strftime _also_ allowed the format string to be omitted, in which case it would produce the same format as str? Then they would remain perfect inverses.
>
>> That said, I still think that the basic idea - being able to reverse-parse the output of datetime.__str__ - is right.
>>
>> I would suggest that a better place for this is an additional classmethod constructor (the datetime class already has quite a number of them). Maybe fromisostring() could be a suitable name ?
>> With this you could even pass an extra-argument for the date-time separator just like with the current isoformat.
>> This constructor would then be more like a counterpart to datetime.isoformat(), but it could simply be documented that calling it with fromisostring(datestring, sep=" ") can be used to parse strings written with datetime.str().
>
> Wouldn't you expect a method called fromisostring to be able to parse any valid ISO string, especially given that there are third-party libs with functions named fromisoformat that do exactly that, and people suggest adding one of them to the stdlib every few months?

Probably yes

> What you want to get across is that this function parses the default Python representation of datetimes; the fact that it happens to be a subset of ISO format doesn't seem as relevant here. I like the idea of a new alternate constructor, I'm just not crazy about the name.

Given that str(dti) (dti being a datetime instance) is conceptually dt.tostr(dti),
name the inverse dt.fromstr(s), as in dti = dt.fromstr(s).

--
Terry Jan Reedy
Wolfgang Maier
2014-08-06 08:35:57 UTC
On 05.08.2014 23:35, Andrew Barnert wrote:
> On Aug 5, 2014, at 14:22, Wolfgang Maier wrote:
>
>> On 05.08.2014 03:39, Steven D'Aprano wrote:
>>>
>>> Since str(d) is documented to use a well-defined format, then I agree
>>> that it makes sense to make the second argument to d.strptime optional,
>>> and default to that same format. The concern I had was the sort of
>>> scenario Skip suggested: I might write out a datetime object as a string
>>> on one machine, where the format is X, and read it back elsewhere, where
>>> the format is Y, leading to at best an exception and at worst incorrect
>>> data.
>>>
>>> +1 on the suggestion.
>>
>> After looking a bit into the code of the datetime module, I am not convinced anymore that strptime() is the right place for the functionality for the following reasons:
>>
>> 1) strptime already has a clear counterpart and that's strftime.
>>
>> 2) strftime/strptime use explicit format strings, not any more sophisticated parsing (as would be required to parse the different formats that datetime.__str__ can produce) and they try, intentionally, to mimic the behavior of their C equivalents.
>>
>> In other words, strftime/strptime have a very clear underlying concept, which IMO should not be given up just because we are trying to stuff some extra-functionality into them.
>
> What if strftime _also_ allowed the format string to be omitted, in which case it would produce the same format as str? Then they would remain perfect inverses.
>

Yes, but strftime without a format string would then be completely
redundant with __str__ and with isoformat using a " " separator, which
really goes against the "one and only one way of doing things" idea.

Plus again, right now strftime takes an explicit format string and then
generates a datetime string with exactly this and only this format.
In the optional format string scenario, it would have to generate
slightly differently formatted output depending on whether there are
microseconds and/or timezone information. So, like for strptime, this
would change the very clearly defined current behavior into a mix of
things, unnecessarily.
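
A small sketch of the difference, using arbitrary sample values: with an
explicit format string the output always has the same shape, whereas
str() varies with the microsecond field.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
whole = datetime(2014, 8, 6, 8, 35, 57)
fractional = datetime(2014, 8, 6, 8, 35, 57, 500000)

# strftime follows the format string literally, so both have the same shape.
assert whole.strftime(FMT) == "2014-08-06 08:35:57.000000"
assert fractional.strftime(FMT) == "2014-08-06 08:35:57.500000"

# str() drops the fractional part when microsecond is 0.
assert str(whole) == "2014-08-06 08:35:57"
assert str(fractional) == "2014-08-06 08:35:57.500000"
```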

>> That said, I still think that the basic idea - being able to reverse-parse the output of datetime.__str__ - is right.
>>
>> I would suggest that a better place for this is an additional classmethod constructor (the datetime class already has quite a number of them). Maybe fromisostring() could be a suitable name ?
>> With this you could even pass an extra-argument for the date-time separator just like with the current isoformat.
>> This constructor would then be more like a counterpart to datetime.isoformat(), but it could simply be documented that calling it with fromisostring(datestring, sep=" ") can be used to parse strings written with datetime.str().
>
> Wouldn't you expect a method called fromisostring to be able to parse any valid ISO string, especially given that there are third-party libs with functions named fromisoformat that do exactly that, and people suggest adding one of them to the stdlib every few months?
>
> What you want to get across is that this function parses the default Python representation of datetimes; the fact that it happens to be a subset of ISO format doesn't seem as relevant here. I like the idea of a new alternate constructor, I'm just not crazy about the name.
>

Fair enough, it was just the first half-reasonable thing that came to my
mind :)
Being able to parse any valid ISO string would be another nice feature,
but it's really a different story.

Wolfgang
Ethan Furman
2014-08-06 10:34:49 UTC