adding dictionaries

Post by Alexander Heger
Is there a good reason for not implementing the "+" operator for dict.update()?

[...]

Post by Alexander Heger
That is
B += A
should be equivalent to
B.update(A)

You're asking the wrong question. The burden is not on people to justify
*not* adding new features, the burden is on somebody to justify adding
them. Is there a good reason for implementing the + operator as
dict.update? We can already write B.update(A), under what circumstances
would you spell it B += A instead, and why?

Post by Alexander Heger
It would be even better if there was also a regular "addition"
operator that is equivalent to creating a shallow copy and then
C = A + B
should be equal to
C = dict(A)
C.update(B)

That would be spelled C = dict(A, **B).

I'd be more inclined to enhance the dict constructor and update methods
so you can provide multiple arguments:

dict(A, B, C, D) # Rather than A + B + C + D
D.update(A, B, C) # Rather than D += A + B + C

Post by Alexander Heger
My apologies if this has been posted before but with a quick google
search I could not see it; if it was, could you please point me to the
thread? I assume this must be a design decision that has been made a
long time ago, but it is not obvious to me why.

I'm not sure it's so much a deliberate decision not to implement
dictionary addition, as uncertainty as to what dictionary addition ought
to mean. Given two dicts:

A = {'a': 1, 'b': 1}
B = {'a': 2, 'c': 2}

I can think of at least four things that C = A + B could do:

# add values, defaulting to 0 for missing keys
C = {'a': 3, 'b': 1, 'c': 2}

# add values, raising KeyError if there are missing keys

# shallow copy of A, update with B
C = {'a': 2, 'b': 1, 'c': 2}

# shallow copy of A, insert keys from B only if not already in A
C = {'a': 1, 'b': 1, 'c': 2}

Except for the second one, I've come across people suggesting that each
of the other three is the one and only obvious thing for A+B to do.
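
For concreteness, here is a minimal sketch of those four readings. The function names are hypothetical and not part of any proposal; A and B are the two dicts given above.

```python
A = {'a': 1, 'b': 1}
B = {'a': 2, 'c': 2}

def add_default_zero(x, y):
    """Add values key by key, treating a missing key as 0."""
    return {k: x.get(k, 0) + y.get(k, 0) for k in x.keys() | y.keys()}

def add_strict(x, y):
    """Add values key by key; raise KeyError if either dict lacks a key."""
    return {k: x[k] + y[k] for k in x.keys() | y.keys()}

def copy_then_update(x, y):
    """Shallow copy of x, updated with y (y wins on shared keys)."""
    result = dict(x)
    result.update(y)
    return result

def copy_then_fill(x, y):
    """Shallow copy of x, taking from y only the keys x lacks (x wins)."""
    result = dict(x)
    for key, value in y.items():
        result.setdefault(key, value)
    return result
```

All four are equally short to write, which is part of why none of them is obviously what A + B should mean.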

--
Steven

Joshua Landau

2014-07-28 05:26:13 UTC

Post by Alexander Heger
Is there a good reason for not implementing the "+" operator for dict.update()?

[...]

Post by Alexander Heger
That is
B += A
should be equivalent to
B.update(A)

One good reason is that people are still convinced "dict(A, **B)"
makes some kind of sense.

But really, we have collections.ChainMap, dict addition is confusing
and there's already a PEP (python.org/dev/peps/pep-0448) that has a
solution I prefer ({**A, **B}).
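
A short sketch of both alternatives, assuming Python 3.5 or later (PEP 448 was still only a proposal when this was posted; the unpacking syntax shipped in 3.5). The variable names are illustrative only.

```python
from collections import ChainMap

A = {'a': 1, 'b': 1}
B = {'a': 2, 'c': 2}

# ChainMap searches its maps left to right, so listing B first makes
# B's values win, matching what A.update(B) would produce.
view = ChainMap(B, A)
merged_copy = dict(view)

# PEP 448 unpacking: later entries overwrite earlier ones.
unpacked = {**A, **B}
```

The ChainMap is a live view over A and B rather than a copy, so later changes to either dict show through; dict(view) and {**A, **B} both take an independent snapshot.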

Steven D'Aprano

2014-07-28 14:59:51 UTC

[...]

Post by Joshua Landau

Post by Steven D'Aprano
Is there a good reason for implementing the + operator as
dict.update?

One good reason is that people are still convinced "dict(A, **B)"
makes some kind of sense.

--
Steven

dw+

2014-07-28 15:33:06 UTC

Post by Joshua Landau
One good reason is that people are still convinced "dict(A, **B)"
makes some kind of sense.

It worked in Python 2, but Python 3 added code to explicitly prevent the
kwargs mechanism from being abused by passing non-string keys.
Effectively, the only reason it worked was due to a Python 2.x kwargs
implementation detail.

It took me a while to come to terms with this one too, it was really
quite a nice hack. But that's all it ever was. The domain of valid keys
accepted by **kwargs should never have exceeded the range supported by
the language syntax for declaring keyword arguments.
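
A minimal sketch of the failure, assuming CPython 3. The helper name merge_with_kwargs is made up for illustration.

```python
A = {'a': 1}
B = {1: 'one'}  # a non-string key

def merge_with_kwargs(x, y):
    """Merge via dict(x, **y); only valid when every key of y is a str."""
    return dict(x, **y)

try:
    merge_with_kwargs(A, B)
    failed = False
except TypeError:
    # Python 3 rejects non-string keyword names.
    failed = True

# String keys are accepted, so this merge goes through.
ok = merge_with_kwargs(A, {'b': 2})
```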

David

Steven D'Aprano

2014-07-28 16:04:50 UTC

Post by dw+

Post by Joshua Landau
One good reason is that people are still convinced "dict(A, **B)"
makes some kind of sense.

It worked in Python 2, but Python 3 added code to explicitly prevent the
kwargs mechanism from being abused by passing non-string keys.

/face-palm

Ah of course! You're right, using dict(A, **B) isn't general enough.

I'm still inclined to prefer allowing update() to accept multiple
arguments:

a.update(b, c, d)

rather than a += b + c + d

which suggests that maybe there ought to be an updated() built-in. Let
the bike-shedding begin: how should such a thing be spelled?

new_dict = a + b + c + d

Pros: + is short to type; subclasses can control the type of new_dict.
Cons: dict addition isn't obvious.

new_dict = updated(a, b, c, d)

Pros: analogous to sort/sorted, reverse/reversed.
Cons: another built-in; isn't very general, only applies to Mappings

new_dict = a.updated(b, c, d)

Pros: only applies to mappings, so it should be a method; subclasses can
control the type of the new dict returned.
Cons: easily confused with dict.update

--
Steven

Ron Adam

2014-07-28 17:17:10 UTC

Post by dw+

Post by Joshua Landau
One good reason is that people are still convinced "dict(A, **B)"
makes some kind of sense.

It worked in Python 2, but Python 3 added code to explicitly prevent the
kwargs mechanism from being abused by passing non-string keys.

/face-palm
Ah of course! You're right, using dict(A, **B) isn't general enough.

and make the language easier to write and use

Post by Steven D'Aprano
I'm still inclined to prefer allowing update() to accept multiple
arguments:

a.update(b, c, d)

To me, the constructor and update method should be as near alike as possible.

So I think if it's done in the update method, it should also work in the
constructor. And other type constructors, such as list, should work in
similar ways as well. I'm not sure that going in this direction would be
good in the long term.

Post by Steven D'Aprano
rather than a += b + c + d
which suggests that maybe there ought to be an updated() built-in. Let
the bike-shedding begin: how should such a thing be spelled?
new_dict = a + b + c + d
Pros: + is short to type; subclasses can control the type of new_dict.
Cons: dict addition isn't obvious.

I think it's more obvious. It only needs __add__ and __iadd__ methods to
make it consistent with the list type.

The con is that somewhere someone could be catching TypeError to
differentiate dict from other types while adding. But it's just as likely
they are doing so in order to add them after a TypeError occurs.

I think this added consistency between lists and dicts would be useful.

But putting __add__ and __iadd__ methods on dicts seems like something
that was probably discussed at length before, and I wonder what reasons
were given for not doing it then.

Cheers,
Ron

Steven D'Aprano

2014-07-29 03:34:12 UTC

[...]

Post by Ron Adam

Post by Steven D'Aprano
new_dict = a + b + c + d
Pros: + is short to type; subclasses can control the type of new_dict.
Cons: dict addition isn't obvious.

I think it's more obvious. It only needs __add__ and __iadd__ methods to
make it consistent with the list type.

What I meant was that it wasn't obvious what dict1 + dict2 should do,
not whether or not the __add__ method exists.

Post by Ron Adam
I think this added consistency between lists and dicts would be useful.

Lists and dicts aren't the same kind of object. I'm not sure it is
helpful to force them to be consistent. Should list grow an update()
method to make it consistent with dicts? How about setdefault()?

As for being useful, useful for what? Useful how often? I'm sure that
one could take any piece of code, no matter how obscure, and say it is
useful *somewhere* :-) but the question is whether it is useful enough
to be part of the language.

I was wrong earlier to dismiss the OP's use-case for dict addition by
suggesting dict(a, **b). Such a thing only works if all the keys of b
are strings. But that doesn't mean that just because my
shoot-from-the-hip response missed the target that we should conclude
that dict addition solves an important problem or that + is the correct
way to spell it.

I'm still dubious that it's needed, but if it were, this is what I
would prefer to see:

* should be a Mapping method, not a top-level function;

* should accept anything the dict constructor accepts, mappings or
lists of (key,value) pairs as well as **kwargs;

* my preferred name for this is now "merged" rather than "updated";

* it should return a new mapping, not modify in-place;

* when called from a class, it should behave like a class method:
MyMapping.merged(a, b, c) should return an instance of MyMapping;

* but when called from an instance, it should behave like an instance
method, with self included in the chain of mappings to merge:
a.merged(b, c) rather than a.merged(a, b, c).

I have a descriptor type which implements the behaviour from the last
two bullet points, so from a technical standpoint it's not hard to
implement this. But I can imagine a lot of push-back from the more
conservative developers about adding a *fourth* method type (even if it
is private) to the Python builtins, so it would take a really compelling
use-case to justify adding a new method type and a new dict method.

(Personally, I think this hybrid class/instance method type is far more
useful than staticmethod, since I've actually used it in production
code, but staticmethod isn't going away.)

--
Steven

Andrew Barnert

2014-07-29 06:15:44 UTC

On Monday, July 28, 2014 8:34 PM, Steven D'Aprano wrote:

[snip]

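* when called from a class, it should behave like a class method:
MyMapping.merged(a, b, c) should return an instance of MyMapping;

* but when called from an instance, it should behave like an instance
method, with self included in the chain of mappings to merge:
a.merged(b, c) rather than a.merged(a, b, c).
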
I have a descriptor type which implements the behaviour from the last
two bullet points, so from a technical standpoint it's not hard to
implement this. But I can imagine a lot of push-back from the more
conservative developers about adding a *fourth* method type (even if it
is private) to the Python builtins, so it would take a really compelling
use-case to justify adding a new method type and a new dict method.
(Personally, I think this hybrid class/instance method type is far more
useful than staticmethod, since I've actually used it in production
code, but staticmethod isn't going away.)

How is this different from a plain-old (builtin or normal) method?

>>> class Spam:
...     def eggs(self, a):
...         print(self, a)
...
>>> spam = Spam()
>>> Spam.eggs(spam, 2)
<__main__.Spam object at 0x106377080> 2
>>> spam.eggs(2)
<__main__.Spam object at 0x106377080> 2
>>> Spam.eggs

>>> spam.eggs

>>> s = {1, 2, 3}
>>> set.union(s, [4])
{1, 2, 3, 4}
>>> s.union([4])
{1, 2, 3, 4}
>>> set.union

>>> s.union
<function union>

This is the way methods have always worked (although the details of how they worked under the covers changed in 3.0, and before that when descriptors and new-style classes were added).

Steven D'Aprano

2014-07-29 13:35:56 UTC

Post by Andrew Barnert
[snip]

How is this different from a plain-old (builtin or normal) method?

I see I failed to explain clearly, sorry about that.

With class methods, the method always receives the class as the first
argument. Regardless of whether you write dict.fromkeys or
{1:'a'}.fromkeys, the first argument is the class, dict.

With instance methods, the method receives the instance. If you call it
from a class, the method is "unbound" and you are responsible for
providing the "self" argument.
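
The two existing behaviours can be checked directly. This is only an illustration; Demo and its method names are hypothetical.

```python
class Demo(dict):
    """Hypothetical subclass, used only to show what each method receives."""

    @classmethod
    def via_classmethod(cls):
        return cls

    def via_instance_method(self):
        return self

d = Demo()

# A class method receives the class, however it is reached.
assert Demo.via_classmethod() is Demo
assert d.via_classmethod() is Demo

# An instance method receives the instance; called on the class, the
# caller has to pass the instance explicitly.
assert d.via_instance_method() is d
assert Demo.via_instance_method(d) is d

# The builtin example from the text: fromkeys ignores the instance's contents.
assert {1: 'a'}.fromkeys('xy') == dict.fromkeys('xy') == {'x': None, 'y': None}
```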

To me, this hypothetical merged() method sometimes feels like an
alternative constructor, like fromkeys, and therefore best written as a
class method, but sometimes like a regular method. Since it feels like a
hybrid to me, I think a hybrid descriptor approach is best, but as I
already said I can completely understand if conservative developers
reject this idea.

In the hybrid form I'm referring to, the first argument provided is the
class when called from the class, and the instance when called from an
instance. Imagine it written in pure Python like this:

class dict:
    @hybridmethod
    def merged(this, *args, **kwargs):
        if isinstance(this, type):
            # Called from the class
            new = this()
        else:
            # Called from an instance.
            new = this.copy()
        for arg in args:
            new.update(arg)
        new.update(kwargs)
        return new
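
The hybridmethod decorator itself is not shown in this thread. A minimal sketch of how such a descriptor might look follows; it is an assumption, not the descriptor mentioned above, and MergeDict is a hypothetical subclass used because the builtin dict cannot be patched.

```python
import functools


class hybridmethod:
    """Hypothetical descriptor: passes the class or the instance as `this`."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        # On class access instance is None, so bind the class instead.
        target = owner if instance is None else instance
        return functools.partial(self.func, target)


class MergeDict(dict):
    """Hypothetical subclass; the builtin dict cannot be patched directly."""

    @hybridmethod
    def merged(this, *args, **kwargs):
        if isinstance(this, type):
            new = this()            # called from the class: start empty
        else:
            new = type(this)(this)  # called from an instance: shallow copy
        for arg in args:
            new.update(arg)
        new.update(kwargs)
        return new
```

The sketch uses type(this)(this) rather than this.copy(), because dict.copy() on a subclass instance returns a plain dict and the subclass type would be lost.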

If merged is a class method, we can avoid having to worry about the
case where your "a" mapping happens to be a list of (key,item) pairs:

a.merged(b, c, d) # Fails if a = [(key, item), ...]
dict.merged(a, b, c, d) # Always succeeds.

It also allows us to easily specify a different mapping type for the
result:

MyMapping.merged(a, b, c, d)

although some would argue this is just as clear:

MyMapping().merged(a, b, c, d)

albeit perhaps not quite as efficient if MyMapping is expensive to
instantiate. (You create an empty instance, only to throw it away
again.)

On the other hand, there are use-cases where merged() best communicates
the intent if it is a regular instance method. Consider:

settings = application_defaults.merged(
    global_settings,
    user_settings,
    commandline_settings)

seems more clear to me than:

settings = dict.merged(
    application_defaults,
    global_settings,
    user_settings,
    commandline_settings)

especially in the case that application_defaults is a dict literal.

tl;dr It's not often that I can't decide whether a method ought to be a
class method or an instance method, the decision is usually easy, but
this is one of those times.

--
Steven

Jonas Wielicki

2014-07-29 14:03:09 UTC

Post by Andrew Barnert
[snip]

Post by Steven D'Aprano
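* when called from a class, it should behave like a class method:
MyMapping.merged(a, b, c) should return an instance of MyMapping;

* but when called from an instance, it should behave like an instance
method, with self included in the chain of mappings to merge:
a.merged(b, c) rather than a.merged(a, b, c).
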
I have a descriptor type which implements the behaviour from the last
two bullet points, so from a technical standpoint it's not hard to
implement this. But I can imagine a lot of push-back from the more
conservative developers about adding a *fourth* method type (even if it
is private) to the Python builtins, so it would take a really compelling
use-case to justify adding a new method type and a new dict method.
(Personally, I think this hybrid class/instance method type is far more
useful than staticmethod, since I've actually used it in production
code, but staticmethod isn't going away.)

How is this different from a plain-old (builtin or normal) method?

[snip]

Post by Steven D'Aprano
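In the hybrid form I'm referring to, the first argument provided is the
class when called from the class, and the instance when called from an
instance. Imagine it written in pure Python like this:

class dict:
    @hybridmethod
    def merged(this, *args, **kwargs):
        if isinstance(this, type):
            # Called from the class
            new = this()
        else:
            # Called from an instance.
            new = this.copy()
        for arg in args:
            new.update(arg)
        new.update(kwargs)
        return new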

[snip]

I really like the semantics of that. This allows for concise, and in my
opinion, clearly readable code.

Although I think maybe one should have two separate methods: the class
method being called ``merged`` and the instance method called
``merged_with``. I find

result = somedict.merged(b, c)

somewhat less clear than

result = somedict.merged_with(b, c)

regards,
jwi

Ron Adam

2014-07-29 23:12:16 UTC

Post by Steven D'Aprano
[...]

Post by Ron Adam

Post by Steven D'Aprano
new_dict = a + b + c + d
Pros: + is short to type; subclasses can control the type of new_dict.
Cons: dict addition isn't obvious.

I think it's more obvious. It only needs __add__ and __iadd__ methods to
make it consistent with the list type.

What I meant was that it wasn't obvious what dict1 + dict2 should do,
not whether or not the __add__ method exists.

What else could it do besides return a new copy of dict1 updated with dict2
contents? It's an unordered container, so it wouldn't append, and the
duplicate keys would be resolved based on the order of evaluation. I don't
see any problem with that. I also don't know of any other obvious way to
combine two dictionaries.

The argument against it may simply be that it's a feature by design, to
have dictionaries unique enough so that code which handles them is clearly
specific to them. I'm not sure how strong that logic is though.

Post by Ron Adam
I think this added consistency between lists and dicts would be useful.

Well, here is how they currently compare.

>>> set(dir(dict)).intersection(set(dir(list)))

{'copy', '__hash__', '__format__', '__sizeof__', '__ge__', '__delitem__',
'__getitem__', '__dir__', 'pop', '__gt__', '__repr__', '__init__',
'__subclasshook__', '__eq__', 'clear', '__len__', '__str__', '__le__',
'__new__', '__reduce_ex__', '__doc__', '__getattribute__', '__ne__',
'__reduce__', '__contains__', '__delattr__', '__class__', '__lt__',
'__setattr__', '__setitem__', '__iter__'}

>>> set(dir(dict)).difference(set(dir(list)))

{'popitem', 'update', 'setdefault', 'items', 'values', 'fromkeys', 'get',
'keys'}

>>> set(dir(list)).difference(set(dir(dict)))

{'sort', '__mul__', 'remove', '__iadd__', '__reversed__', 'insert',
'extend', 'append', 'count', '__add__', '__rmul__', 'index', '__imul__',
'reverse'}

They do have quite a lot in common already. The usefulness of different
types having the same methods is that external code can be less specific to
the objects they handle. Of course, if those like methods act too
differently they can be surprising as well. That may be the case if '+'
and '+=' are used to update dictionaries, but then again, maybe not. (?)

Post by Steven D'Aprano
As for being useful, useful for what? Useful how often? I'm sure that
one could take any piece of code, no matter how obscure, and say it is
useful *somewhere* :-) but the question is whether it is useful enough
to be part of the language.

That's where examples will have an advantage over an initial personal
opinion. Not that initial opinions aren't useful at first to express
support or non-support. I could have just used +1. ;-)

Cheers,
Ron

Steven D'Aprano

2014-07-30 00:17:26 UTC

On Tue, Jul 29, 2014 at 06:12:16PM -0500, Ron Adam wrote on the
similarity of lists and dicts:

[...]

Post by Ron Adam
Well, here is how they currently compare.

>>> set(dir(dict)).intersection(set(dir(list)))

Now strip out the methods which are common to pretty much all objects,
in other words just look at the ones which are common to mapping and
sequence APIs but not to objects in general:

{'copy', '__ge__', '__delitem__', '__getitem__', 'pop', '__gt__',
'clear', '__len__', '__le__', '__contains__', '__lt__', '__setitem__',
'__iter__'}

And now look a little more closely:

- although dicts and lists both support order comparisons like > and <,
you cannot compare a dict to a list in Python 3;

- although dicts and lists both support a pop method, their signatures
are different; x.pop() will fail if x is a dict, and x.pop(k, d) will
fail if x is a list;

- although both support membership testing "a in x", what is being
tested is rather different; if x is a dict, then a must be a key,
but the analog of keys for lists is the index, not the value.
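
The pop and membership differences are easy to check. A small sketch, with a made-up helper raises_type_error:

```python
d = {'k': 'v'}
lst = ['k', 'v']

def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False

# dict.pop needs a key; list.pop takes at most an index, never a default.
dict_pop_without_key = raises_type_error(dict(d).pop)
list_pop_with_default = raises_type_error(list(lst).pop, 0, None)

# Membership looks at keys for a dict but at values for a list.
value_in_dict = 'v' in d      # False: 'v' is a value, not a key
value_in_list = 'v' in lst    # True
index_in_list = 0 in lst      # False: indexes are not searched
```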

So the similarities between list and dict are:

* both have a length

* both are iterable

* both support subscripting operations x[i]

* although dicts don't support slicing x[i:j:k]

* both support a copy() method

* both support a clear() method

That's not a really big set of operations in common, and they're rather
general.

The real test is, under what practical circumstances would you expect to
freely substitute a list for a dict or vice versa, and what could you do
with that object when you received it?

For me, the only answer that comes readily to mind is that the dict
constructor accepts either another dict or a list of (key,item) pairs.

[...]

Post by Ron Adam
They do have quite a lot in common already. The usefulness of different
types having the same methods is that external code can be less specific to
the objects they handle.

I don't think that it is reasonable to treat dicts and lists as having a
lot in common. They have a little in common, by virtue of both being
containers, but then a string bag and a 40ft steel shipping container
are both containers too, so that doesn't imply much similarity :-) It
seems to me that outside of utterly generic operations like iteration,
conversion to string and so on, lists do not quack like dicts, and dicts
do not swim like lists, in any significant sense.

--
Steven

Ron Adam

2014-07-30 14:27:00 UTC

Post by Steven D'Aprano
On Tue, Jul 29, 2014 at 06:12:16PM -0500, Ron Adam wrote on the
[...]

Post by Ron Adam
Well, here is how they currently compare.

>>> set(dir(dict)).intersection(set(dir(list)))

Now strip out the methods which are common to pretty much all objects,
in other words just look at the ones which are common to mapping and
{'copy', '__ge__', '__delitem__', '__getitem__', 'pop', '__gt__',
'clear', '__len__', '__le__', '__contains__', '__lt__', '__setitem__',
'__iter__'}
- although dicts and lists both support order comparisons like > and <,
you cannot compare a dict to a list in Python 3;

I think this would be the case we are describing with + and +=. You would
not be able to add a dict and some other incompatible type.

Cheers,
Ron

Ryan Hiebert

2014-07-28 18:37:03 UTC

Post by Steven D'Aprano
I'm still inclined to prefer allowing update() to accept multiple
arguments:

a.update(b, c, d)

rather than a += b + c + d

which suggests that maybe there ought to be an updated() built-in. Let
the bike-shedding begin: how should such a thing be spelled?

new_dict = a + b + c + d

or, to match set

new_dict = a | b | c | d
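
For reference, this spelling does work on later Pythons: Python 3.9 (PEP 584) gave dict the | and |= operators, with the right-hand operand winning on duplicate keys. A small sketch with made-up sample dicts, falling back to PEP 448 unpacking on older versions:

```python
import sys

a, b, c, d = {'k': 1}, {'k': 2, 'x': 0}, {'y': 0}, {'k': 3}

if sys.version_info >= (3, 9):
    new_dict = a | b | c | d          # dict union operator (PEP 584)
else:
    new_dict = {**a, **b, **c, **d}   # PEP 448 unpacking as a fallback
```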

Nick Coghlan

2014-07-28 22:20:53 UTC

Post by Steven D'Aprano
I'm still inclined to prefer allowing update() to accept multiple
arguments:

a.update(b, c, d)

rather than a += b + c + d

Note that if update() was changed to accept multiple args, the dict()
constructor could similarly be updated.

Then:

x = dict(a)
x.update(b)
x.update(c)
x.update(d)

Would become:

x = dict(a, b, c, d)

Aside from the general "What's the use case that wouldn't be better served
by a larger scale refactoring?" concern, my main issue with that approach
would be the asymmetry it would introduce with the set constructor (which
disallows multiple arguments to avoid ambiguity in the single argument
case).

But really, I'm not seeing a compelling argument for why this needs to be a
builtin. If someone is merging dicts often enough to care, they can already
write a function to do the dict copy-and-update as a single operation. What
makes this more special than the multitude of other three line functions in
the world?
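
Such a helper might look like the following sketch. The name merged is only a placeholder, and later mappings win on duplicate keys.

```python
def merged(*mappings):
    """Return a new dict built by updating left to right (later wins)."""
    result = {}
    for mapping in mappings:
        result.update(mapping)
    return result
```

Because it relies on dict.update, each argument may be a mapping or an iterable of (key, value) pairs, and none of the inputs is modified.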

Cheers,
Nick.

Alexander Heger

2014-07-28 22:48:49 UTC

Post by Nick Coghlan
But really, I'm not seeing a compelling argument for why this needs to be a
builtin. If someone is merging dicts often enough to care, they can already
write a function to do the dict copy-and-update as a single operation. What
makes this more special than the multitude of other three line functions in
the world?

We all have too many of those.

This would not add much complexity to the language, and it would remove
some awkward constructs that are otherwise needed.
Currently, lacking such operators, dictionaries are not as easy to use
as an everyday data type should be.

-Alexander

Guido van Rossum

2014-07-28 16:08:49 UTC

In addition, dict(A, **B) is not something you easily stumble upon when
your goal is "merge two dicts"; nor is it even clear that that's what it is
when you read it for the first time.

All signs of too-clever hacks in my book.

Post by dw+

Post by Joshua Landau
One good reason is that people are still convinced "dict(A, **B)"
makes some kind of sense.

It worked in Python 2, but Python 3 added code to explicitly prevent the
kwargs mechanism from being abused by passing non-string keys.
Effectively, the only reason it worked was due to a Python 2.x kwargs
implementation detail.
It took me a while to come to terms with this one too, it was really
quite a nice hack. But that's all it ever was. The domain of valid keys
accepted by **kwargs should never have exceeded the range supported by
the language syntax for declaring keyword arguments.
David
_______________________________________________
Python-ideas mailing list
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

--
--Guido van Rossum (python.org/~guido)

Antoine Pitrou

2014-07-28 17:29:00 UTC

Post by Guido van Rossum
In addition, dict(A, **B) is not something you easily stumble upon when
your goal is "merge two dicts"; nor is it even clear that that's what it
is when you read it for the first time.
All signs of too-clever hacks in my book.

Agreed with Guido (!).

Regards

Antoine.

Alexander Heger

2014-07-28 20:59:29 UTC

In addition, dict(A, **B) is not something you easily stumble upon when your
goal is "merge two dicts"; nor is it even clear that that's what it is when
you read it for the first time.
All signs of too-clever hacks in my book.

I try to convince students to learn and *use* python.

If I tell students that to merge two dictionaries they have to write
dict(A, **B) or {**A, **B}, that seems less clear (not something you
"stumble across", as Guido says) than A + B; then we still have to tell
them the rules of the operation, as usual for any operation.

It does not have to be "+", could be the "union" operator "|" that is
used for sets where
s.update(t)
is the same as
s |= t

... and accordingly

D = A | B | C

Maybe this operator is better as this equivalence is already being
used (for sets). Accordingly "union(A,B)" could do a merge operation
and return the new dict().

(This would still allow people who want "+" to add the values to be made
happy in the long run.)

-Alexander

Andrew Barnert

2014-07-28 22:17:22 UTC

In addition, dict(A, **B) is not something you easily stumble upon when your
goal is "merge two dicts"; nor is it even clear that that's what it is when
you read it for the first time.
All signs of too-clever hacks in my book.

The difference is that with sets, it (at least conceptually) doesn't matter whether you keep elements from s or t when they collide, because by definition they only collide if they're equal, but with dicts, it very much matters whether you keep items from s or t when their keys collide, because the corresponding values are generally _not_ equal. So this is a false analogy; the same problem raised in the first three replies on this thread still needs to be answered: Is it obvious that the values from b should overwrite the values from a (assuming that's the rule you're suggesting, since you didn't specify; translate to the appropriate question if you want a different rule) in all real-life use cases? If not, is this so useful that the benefits in some uses outweigh the almost certain confusion in others? Without a compelling "yes" to one of those two questions, we're still at square one here; switching from + to | and making an analogy with sets doesn't help.
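
The asymmetry can be seen in a few lines. This is a sketch using the A and B from earlier in the thread, and it assumes Python 3.5+ for the unpacking syntax.

```python
A = {'a': 1, 'b': 1}
B = {'a': 2, 'c': 2}

# Set union is symmetric: the operand order does not change the result.
keys_ab = set(A) | set(B)
keys_ba = set(B) | set(A)

# A dict merge is not: whichever side is applied last wins the collision.
b_wins = {**A, **B}
a_wins = {**B, **A}
```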

Post by Alexander Heger
... and accordingly
D = A | B | C
Maybe this operator is better as this equivalence is already being
used (for sets). Accordingly "union(A,B)" could do a merge operation
and return the new dict().

Wouldn't you expect a top-level union function to take any two iterables and return the union of them as a set (especially given that set.union accepts any iterable for its non-self argument)? A.union(B) seems a lot better than union(A, B).

Then again, A.updated(B) or updated(A, B) might be even better, as someone suggested, because the parallel between update and updated (and between e.g. sort and sorted) is not at all problematic.

Post by Alexander Heger
(this then still allows people who want "+" to add the values be made
happy in the long run)
-Alexander
_______________________________________________
Python-ideas mailing list
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Alexander Heger

2014-07-28 22:35:55 UTC

Post by Andrew Barnert
The difference is that with sets, it (at least conceptually) doesn't matter whether you keep elements from s or t when they collide, because by definition they only collide if they're equal, but with dicts, it very much matters whether you keep items from s or t when their keys collide, because the corresponding values are generally _not_ equal. So this is a false analogy; the same problem raised in the first three replies on this thread still needs to be answered: Is it obvious that the values from b should overwrite the values from a (assuming that's the rule you're suggesting, since you didn't specify; translate to the appropriate question if you want a different rule) in all real-life use cases? If not, is this so useful that the benefits in some uses outweigh the almost certain confusion in others? Without a compelling "yes" to one of those two questions, we're still at square one here; switching from + to | and making an analogy with sets doesn't help.

Post by Andrew Barnert

Wouldn't you expect a top-level union function to take any two iterables and return the union of them as a set (especially given that set.union accepts any iterable for its non-self argument)? A.union(B) seems a lot better than union(A, B).
Then again, A.updated(B) or updated(A, B) might be even better, as someone suggested, because the parallel between update and updated (and between e.g. sort and sorted) is not at all problematic.

yes, one does have to deal with collisions and spell out a clear rule:
same behaviour as update().

I was less uneasy about the | operator
1) it is already used the same way for collections.Counter [this is a
quite strong constraint]
2) in shells it is used as "pipe" implying directionality - order matters

yes, you are wondering whether the order should be this or that; you
just *define* what it is, same as you do for subtraction.

Another way of looking at it is to say that even in sets you take the
second, but because they are identical it does not matter ;-)

-Alexander

Guido van Rossum

2014-07-28 15:40:17 UTC

I'll regret jumping in here, but while dict(A, **B) as a way to merge two
dicts A and B makes some sense, it has two drawbacks: (1) slow (creates an
extra copy of B as it creates the keyword args structure for dict()) and
(2) not general enough (doesn't support key types other than str).
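
The second drawback can be seen in a few lines. This is a minimal sketch for Python 3; the dict names are placeholders chosen for illustration, and the fallback shown is plain copy-then-update.

```python
# Merging with keyword expansion works when every key in B is a string.
A = {"a": 1, "b": 1}
B = {"a": 2, "c": 2}
merged = dict(A, **B)          # B's values win on collisions
print(merged)                  # {'a': 2, 'b': 1, 'c': 2}

# With non-string keys the call is rejected, so copy-then-update is needed.
X = {1: "one"}
Y = {2: "two"}
try:
    dict(X, **Y)
    keyword_merge_ok = True
except TypeError:              # keyword names must be strings
    keyword_merge_ok = False

fallback = dict(X)
fallback.update(Y)
print(keyword_merge_ok, fallback)
```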

Post by Steven D'Aprano
[...]

Post by Joshua Landau

Post by Steven D'Aprano
Is there a good reason for implementing the + operator as
dict.update?

One good reason is that people are still convinced "dict(A, **B)"
makes some kind of sense.

Explain please. dict(A, **B) makes perfect sense to me, and it works
perfectly too. It's a normal constructor call, using the same syntax as
any other function or method call. Are you suggesting that it does not
make sense?
--
Steven

--
--Guido van Rossum (python.org/~guido)

Terry Reedy

2014-07-27 01:27:04 UTC

Post by Alexander Heger
Is there a good reason for not implementing the "+" operator for dict.update()?

As you immediately noticed, this is an incoherent request as stated. A op
B should be a new object.

Post by Alexander Heger
A = dict(a=1, b=1)
B = dict(a=2, c=2)
B += A

Since "B op= A" is *defined* as resulting in B having the value of "B op
A", with the operations possibly being done in-place if B is mutable, we
would first have to define addition on dicts.

Post by Alexander Heger
B
dict(a=1, b=1, c=2)
That is
B += A
should be equivalent to
B.update(A)
It would be even better if there was also a regular "addition"
operator that is equivalent to creating a shallow copy and then

You have this backwards. Dict addition would have to come first, and
there are multiple possible and contextually useful definitions. The
idea of choosing any one of them as '+' has been rejected.

As indicated, augmented dict addition would follow from the choice of
dict addition. It would not necessarily be equivalent to .update. The
addition needed to make this true would be asymmetric, like catenation.

But unlike sequence catenation, information is erased in that items in
the updated dict get subtracted. Conceptually, update is replacement
rather than just addition.

Post by Alexander Heger
My apologies if this has been posted

Multiple dict additions have been proposed and discussed here on
python-ideas and probably on python-list.

--
Terry Jan Reedy

Alexander Heger

2014-07-27 02:18:48 UTC

Dear Terry,

As you immediately noticed, this is an incoherent request as stated. A op B
should be a new object.
[...]
You have this backwards. Dict addition would have to come first, and there
are multiple possible and contextually useful definitions. The idea of
choosing any one of them as '+' has been rejected.

I had set out wanting to have a short form for dict.update(), hence
the apparently reversed order.
The proposed full addition does the same after first making a shallow
copy; the operator interface does define both __iadd__ and __add__.

As indicated, augmented dict addition would follow from the choice of dict
addition. It would not necessarily be equivalent to .update. The addition
needed to make this true would be asymmetric, like catenation.

yes. As I note, most uses of the "+" operator in Python are not
symmetric (commutative).

But unlike sequence catenation, information is erased in that items in the
updated dict get subtracted. Conceptually, update is replacement rather than
just addition.

Yes, not being able to have multiple identical keys is the nature of
dictionaries.
This does not mean that things should not be done in the best way they
can be done.
I was considering the set union operator "|" but that is also
symmetric and may cause more confusion.

Another consideration suggested was the element-wise addition in some form.
This is the natural way of doing things for structures of fixed length
like arrays, including numpy arrays.
And this is being accepted.
In contrast, for data structures with variable length, like lists and
strings, "addition" is concatenation, and what I would see as the most
natural extension for dictionaries hence is to add the keys (not the
key values or values to each other), with the common behavior to
overwrite existing keys. You do have the choice in which order you
write the operation.

It would be funny if addition of strings would add their ASCII, char,
or unicode values and return the resulting string.

Sorry for bringing up, again, the old discussion of how to add
dictionaries as part of this.

-Alexander


Nathan Schneider

2014-07-28 18:58:21 UTC

Post by Alexander Heger
My apologies if this has been posted before but with a quick google
search I could not see it; if it was, could you please point me to the
thread?

Here are two threads that had some discussion of this:
https://mail.python.org/pipermail/python-ideas/2011-December/013227.html
and https://mail.python.org/pipermail/python-ideas/2013-June/021140.html.

Seems like a useful feature if there could be a clean way to spell it.

Cheers,
Nathan

Paul Moore

2014-07-28 19:21:54 UTC

Post by Nathan Schneider
https://mail.python.org/pipermail/python-ideas/2011-December/013227.html

This doesn't seem to have a use case, other than "it would be nice".

Post by Nathan Schneider
https://mail.python.org/pipermail/python-ideas/2013-June/021140.html.

This can be handled using ChainMap, if I understand the proposal.

Post by Nathan Schneider
Seems like a useful feature if there could be a clean way to spell it.

I've yet to see any real-world situation when I've wanted "dictionary
addition" (with any of the various semantics proposed here) and I've
never encountered a situation where using d1.update(d2) was
sufficiently awkward that having an operator seemed reasonable.

In all honesty, I'd suggest that code which looks bad enough to
warrant even considering this feature is probably badly in need of
refactoring, at which point the problem will likely go away.

Paul

Andrew Barnert

2014-07-28 20:20:20 UTC

Post by Nathan Schneider
https://mail.python.org/pipermail/python-ideas/2011-December/013227.html

This doesn't seem to have a use case, other than "it would be nice".

Post by Nathan Schneider
https://mail.python.org/pipermail/python-ideas/2013-June/021140.html.

This can be handled using ChainMap, if I understand the proposal.

When the underlying dicts and desired combined dict are all going to be used immutably, ChainMap is the perfect answer. (Better than an "updated" function for performance if nothing else.) And usually, when you're looking for a non-mutating combine-dicts operation, that will be what you want.

But usually isn't always. If you want a snapshot of the combination of mutable dicts, ChainMap is wrong. If you want to be able to mutate the result, ChainMap is wrong.
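
A small sketch of the two cases, using hypothetical dicts named defaults and overrides. It only illustrates documented ChainMap behaviour: lookups go through to the underlying dicts, and writes land in the first mapping.

```python
from collections import ChainMap

defaults = {"color": "red", "size": 1}
overrides = {"size": 2}

view = ChainMap(overrides, defaults)   # first mapping wins on lookup
snapshot = dict(view)                  # independent copy of the combined state

defaults["color"] = "blue"             # mutate a source dict afterwards
print(view["color"])                   # the view follows the change
print(snapshot["color"])               # the snapshot does not

view["shape"] = "round"                # writes go to the first mapping only
print(overrides)
```

So a ChainMap is a live view rather than a snapshot, and mutating it silently mutates one of the inputs.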

All that being said, I'm not sure these use cases are sufficiently common to warrant adding an operator--especially since there are other just-as-(un)common use cases it wouldn't solve. (For example, what I often want is a mutable "overlay" ChainMap, which doesn't need to copy the entire potentially-gigantic source dicts. I wouldn't expect an operator for that, even though I need it far more often than I need a mutable snapshot copy.)

And of course, as you say, real-life use cases would be a lot more compelling than theoretical/abstract ones.


Petr Viktorin

2014-07-28 20:53:43 UTC

On Mon, Jul 28, 2014 at 10:20 PM, Andrew Barnert

In those cases, do dict(ChainMap(...)).

Post by Andrew Barnert
All that being said, I'm not sure these use cases are sufficiently common to warrant adding an operator--especially since there are other just-as-(un)common use cases it wouldn't solve. (For example, what I often want is a mutable "overlay" ChainMap, which doesn't need to copy the entire potentially-gigantic source dicts. I wouldn't expect an operator for that, even though I need it far more often than I need a mutable snapshot copy.)
And of course, as you say, real-life use cases would be a lot more compelling than theoretical/abstract ones.

Alexander Heger

2014-07-28 22:21:09 UTC

Post by Andrew Barnert
When the underlying dicts and desired combined dict are all going to be used immutably, ChainMap is the perfect answer. (Better than an "updated" function for performance if nothing else.) And usually, when you're looking for a non-mutating combine-dicts operation, that will be what you want.
But usually isn't always. If you want a snapshot of the combination of mutable dicts, ChainMap is wrong. If you want to be able to mutate the result, ChainMap is wrong.
All that being said, I'm not sure these use cases are sufficiently common to warrant adding an operator--especially since there are other just-as-(un)common use cases it wouldn't solve. (For example, what I often want is a mutable "overlay" ChainMap, which doesn't need to copy the entire potentially-gigantic source dicts. I wouldn't expect an operator for that, even though I need it far more often than I need a mutable snapshot copy.)
And of course, as you say, real-life use cases would be a lot more compelling than theoretical/abstract ones.

For many applications you may not care one way or the other, only for
some you do, and only then you need to know the details of operation.

My point is to make the dict() data structure more easy to use for
most users and use cases. Especially novices.
This is what adds power to the language. Not that you can do things
(Turing machines can) but that you can do them easily and naturally.

Nick Coghlan

2014-07-28 22:40:02 UTC

Post by Alexander Heger
My point is to make the dict() data structure more easy to use for
most users and use cases. Especially novices.
This is what adds power to the language. Not that you can do things
(Turing machines can) but that you can do them easily and naturally.

But why is dict merging into a *new* dict something that needs to be done
as a single expression? What's the problem with spelling out "to merge two
dicts into a new one, first make a dict, then merge in the other one":

x = dict(a)
x.update(b)

That's the real competitor here, not the more cryptic "x = dict(a, **b)"

You can even use it as an example of factoring out a helper function:

def copy_and_update(a, *args):
    x = dict(a)
    for arg in args:
        x.update(arg)
    return x
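
A usage sketch of the helper, repeated here so the snippet runs on its own. The dicts base, site and user are made-up inputs, not anything from the thread.

```python
def copy_and_update(a, *args):
    """Return a shallow copy of `a`, updated with each of `args` in order."""
    x = dict(a)
    for arg in args:
        x.update(arg)
    return x


base = {"host": "localhost", "port": 80}
site = {"port": 8080}
user = {"debug": True}

config = copy_and_update(base, site, user)
print(config)   # {'host': 'localhost', 'port': 8080, 'debug': True}
print(base)     # the first argument is left untouched
```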

My personal experience suggests that's a rare enough use case that it's
fine to leave it as a trivial helper function that people can write if they
need it. The teaching example isn't compelling, since in the teaching case,
spelling out the steps is going to be necessary anyway to explain what the
function or method call is actually doing.

Cheers,
Nick.


Alexander Heger

2014-07-28 23:45:06 UTC

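But why is dict merging into a *new* dict something that needs to be done as
a single expression? What's the problem with spelling out "to merge two
dicts into a new one, first make a dict, then merge in the other one":
x = dict(a)
x.update(b)
That's the real competitor here, not the more cryptic "x = dict(a, **b)"
You can even use it as an example of factoring out a helper function:
def copy_and_update(a, *args):
    x = dict(a)
    for arg in args:
        x.update(arg)
    return x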
My personal experience suggests that's a rare enough use case that it's fine
to leave it as a trivial helper function that people can write if they need
it. The teaching example isn't compelling, since in the teaching case,
spelling out the steps is going to be necessary anyway to explain what the
function or method call is actually doing.

it is more about having easy operations for people who learn Python
for the sake of using it (besides, I teach science students not
computer science students).

The point is that it could be done in one operation. It seems like
asking people to write

a = 2 + 3

as

a = int(2)
a.add(3)

Turing machine vs modern programming language.

It does already work for Counters.

The discussion seems to go such that because people can't agree
whether the first or second occurrence of keys takes precedence, or
what operator to use (already decided by the design of Counter) it is
not done at all. To be fair, I am not a core Python programmer and am
asking others to implement this - or maybe even agree it would be
useful -, maybe pushing too much where just an idea should be floated.

-Alexander

Andrew Barnert

2014-07-29 02:09:31 UTC

Post by Alexander Heger
The discussion seems to go such that because people can't agree
whether the first or second occurrence of keys takes precedence, or
what operator to use (already decided by the design of Counter) it is
not done at all.

Well, yeah, that happens a lot. A good idea that can't be turned into a concrete design that fits the language and makes everyone happy doesn't get added, unless it's so ridiculously compelling that nobody can imagine living without it.

But that's not necessarily a bad thing--it's why Python is a relatively small and highly consistent language, which I think is a big part of why Python is so readable and teachable.

Anyway, I think you're on to something with your idea of adding an updated or union or whatever function/method whose semantics are obvious, and then mapping the operators to that method and update. I can definitely buy that a.updated(b) or union(a, b) favors values from b for exactly the same reason a.update(b) does (although as I mentioned I have other problems with a union function).

Meanwhile, if you have use cases for which ChainMap is not appropriate, you might want to write a dict subclass that you can use in your code or in teaching students or whatever, so you can amass some concrete use cases and show how much cleaner it is than the existing alternatives.

Post by Alexander Heger
To be fair, I am not a core Python programmer and am
asking others to implement this - or maybe even agree it would be
useful -, maybe pushing too much where just an idea should be floated.

If it helps, if you can get everyone to agree on this, except that none of the core devs wants to do the work, I'll volunteer to write the C code (after I finish my io patch and my abc patch...), so you only have to add the test cases (which are easy Python code; the only hard part is deciding what to test) and the docs.

Alexander Heger

2014-07-28 22:15:49 UTC

Post by Paul Moore
In all honesty, I'd suggest that code which looks bad enough to
warrant even considering this feature is probably badly in need of
refactoring, at which point the problem will likely go away.

I often want to call functions with added (or removed, replaced)
keywords from the call.

args0 = dict(...)
args1 = dict(...)

def f(**kwargs):
    g(**(args0 | kwargs | args1))

currently I have to write

def f(**kwargs):
    temp_args = dict(args0)
    temp_args.update(kwargs)
    temp_args.update(args1)
    g(**temp_args)

It would also make the proposed feature to allow multiple kw args
expansions in Python 3.5 easy to write by having

f(**a, **b, **c)
be equivalent to
f(**(a | b | c))

-Alexander

Andrew Barnert

2014-07-28 22:19:22 UTC

I often want to call functions with added (or removed, replaced)
keywords from the call.
args0 = dict(...)
args1 = dict(...)
g(**(args0 | kwargs | args1))
currently I have to write
temp_args = dict(args0)
temp_args.update(kwargs)
temp_args.update(args1)
g(**temp_args)

No, you just have to write a one-liner with ChainMap, except in the (very rare) case where you're expecting g to hold onto and later modify its kwargs.

Post by Alexander Heger
It would also make the proposed feature to allow multiple kw args
expansions in Python 3.5 easy to write by having
f(**a, **b, **c)
be equivalent to
f(**(a | b | c))
-Alexander

Alexander Heger

2014-07-28 23:04:42 UTC

Post by Andrew Barnert

Post by Alexander Heger
temp_args = dict(args0)
temp_args.update(kwargs)
temp_args.update(args1)
g(**temp_args)

No, you just have to write a one-liner with ChainMap, except in the (very rare) case where you're expecting g to hold onto and later modify its kwargs.

yes, this (modify) is what I do.

In any case, it would still be

g(**collections.ChainMap(args1, kwargs, args0))

In either case a new dict is created and passed to g as kwargs.

It's not pretty, but it does work. Thanks.

so the general case

D = A | B | C

becomes

D = dict(collections.ChainMap(C, B, A))

(someone may suggest dict could have a "chain" constructor class
method D = dict.chain(C, B, A))
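
A quick check of that equivalence, with three made-up dicts. The sketch assumes the intended rule is "later operands overwrite earlier ones", as with update().

```python
from collections import ChainMap

A = {"a": 1, "b": 1}
B = {"a": 2, "c": 2}
C = {"c": 3, "d": 3}

# Copy-then-update: later dicts overwrite earlier ones.
by_update = dict(A)
by_update.update(B)
by_update.update(C)

# ChainMap searches its mappings left to right, so the
# highest-priority dict has to be listed first.
by_chain = dict(ChainMap(C, B, A))

print(by_update == by_chain)
```

Both spellings give the same contents, but the arguments appear in opposite orders.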

Paul Moore

2014-07-29 06:22:34 UTC

Post by Alexander Heger
D = A | B | C
becomes
D = dict(collections.ChainMap(C, B, A))

This immediately explains the key problem with this proposal. It never
even *occurred* to me that anyone would expect C to take priority over
A in the operator form. But the ChainMap form makes it immediately
clear to me that this is the intent.

An operator form will be nothing but a maintenance nightmare and a
source of bugs. Thanks for making this obvious :-)

-1.

Paul

Jonas Wielicki

2014-07-29 11:56:57 UTC

Post by Alexander Heger
D = A | B | C
becomes
D = dict(collections.ChainMap(C, B, A))

FWIW, one could use an operator which inherently shows a direction: <<
and >>, for both directions respectively.

A = B >> C lets B take precedence, and A = B << C lets C take precedence.

regards,
jwi

p.s.: I’m not entirely sure what to think about my suggestion---I’d like
to hear opinions.

Post by Paul Moore
An operator form will be nothing but a maintenance nightmare and a
source of bugs. Thanks for making this obvious :-)
-1.
Paul

Paul Moore

2014-07-29 12:29:52 UTC

Post by Jonas Wielicki
FWIW, one could use an operator which inherently shows a direction: <<
and >>, for both directions respectively.
A = B >> C lets B take precedence, and A = B << C lets C take precedence.
regards,
jwi
p.s.: I’m not entirely sure what to think about my suggestion---I’d like
to hear opinions.

Personally, I don't like it much more than the symmetric-looking
operators. I get your point, but it feels like you're just patching
over a relatively small aspect of a fundamentally bad idea. But then
again as I've already said, I see no need for any of this, the
existing functionality seems fine to me.

Paul

Nathan Schneider

2014-07-29 14:50:02 UTC

If there is to be an operator devoted specifically to this, I like << and
https://mail.python.org/pipermail/python-ideas/2011-December/013232.html :)

I am also partial to the {**A, **B} proposal in
http://legacy.python.org/dev/peps/pep-0448/.
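
For reference, a sketch of how the {**A, **B} spelling behaves on Python 3.5 or later, where PEP 448 was implemented. The function name show is a placeholder.

```python
A = {"a": 1, "b": 1}
B = {"a": 2, "c": 2}

# In a dict display, later unpackings overwrite earlier ones.
merged = {**A, **B}
print(merged)   # {'a': 2, 'b': 1, 'c': 2}


def show(**kwargs):
    return kwargs


# In a function call, a repeated keyword is an error rather than an override.
try:
    show(**A, **B)
    duplicate_ok = True
except TypeError:
    duplicate_ok = False
print(duplicate_ok)
```

So the display form follows the update() rule, while a call with several ** expansions refuses colliding keys instead of picking a winner.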

Cheers,
Nathan

Greg Ewing

2014-07-29 22:46:46 UTC

While it succeeds in indicating a direction, it
fails to suggest any kind of addition or union.

--
Greg

Jonas Wielicki

2014-07-30 08:37:23 UTC

Post by Greg Ewing

While it succeeds in indicating a direction, it
fails to suggest any kind of addition or union.

As already noted elsewhere (to continue playing devil’s advocate), it’s
not an addition or union anyway. It’s not a union because it is lossy,
and since it is not commutative it’s not something I’d call addition either.

One can, however, certainly see it as shifting the elements from dict A over
dict B.

regards,
jwi

Steven D'Aprano

2014-07-29 14:36:05 UTC

Post by Alexander Heger
D = A | B | C
becomes
D = dict(collections.ChainMap(C, B, A))

Hmmm. Funny you say that, because to me that is a major disadvantage of
the ChainMap form: you have to write the arguments in reverse order.

Suppose that we want to start with a, then override it with b, then
override that with c. Since a is the start (the root, the base), we
start with a, something like this:

d = {}
d.update(a)
d.update(b)
d.update(c)

If update was chainable as it would be in Ruby:

d.update(a).update(b).update(c)

or even:

d.update(a, b, c)

This nicely leads us to d = a+b+c (assuming we agree that + meaning
merge is the spelling we want).

The ChainMap, on the other hand, works backwards from this perspective:
the last dict to be merged has to be given first:

ChainMap(c, b, a)

--
Steven

Andrew Barnert

2014-07-29 19:29:29 UTC

Post by Alexander Heger
D = A | B | C
becomes
D = dict(collections.ChainMap(C, B, A))

Hmmm. Funny you say that, because to me that is a major disadvantage of
the ChainMap form: you have to write the arguments in reverse order.

I think that's pretty much exactly his point:

To him, it's obvious that + should be in the order of ChainMap, and he can't even conceive of the possibility that you'd want it "backward".

To you, it's obvious that + should be the other way around, and you find it annoying that ChainMap is "backward".

Which seems to imply that any attempt at setting an order is going to not only seem backward, but possibly surprisingly so, to a subset of Python's users.

And this is the kind of thing that can lead to subtle bugs. If a and b _almost never_ have duplicate keys, but very rarely do, you won't catch the problem until you think to test for it. And if one order or the other is so obvious to you that you didn't even imagine anyone would ever implement the opposite order, you probably won't think to write the test until you have a bug in the field…
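
A sketch of how such a bug stays hidden. The two helper functions are hypothetical stand-ins for the two possible precedence rules; neither is an existing API.

```python
def merge_right_wins(a, b):
    """Copy-then-update: values from b replace values from a."""
    result = dict(a)
    result.update(b)
    return result


def merge_left_wins(a, b):
    """The opposite rule: values from a are kept on collisions."""
    result = dict(b)
    result.update(a)
    return result


disjoint_a, disjoint_b = {"x": 1}, {"y": 2}
clash_a, clash_b = {"x": 1}, {"x": 99, "y": 2}

# With disjoint keys the two rules cannot be told apart...
same_when_disjoint = (
    merge_right_wins(disjoint_a, disjoint_b) == merge_left_wins(disjoint_a, disjoint_b)
)
# ...they only differ on the rare input with a shared key.
same_when_clashing = (
    merge_right_wins(clash_a, clash_b) == merge_left_wins(clash_a, clash_b)
)
print(same_when_disjoint, same_when_clashing)
```

A test suite that never includes a colliding key passes under either rule.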

Nick Coghlan

2014-07-30 11:52:54 UTC

Post by Andrew Barnert

Post by Alexander Heger
D = A | B | C
becomes
D = dict(collections.ChainMap(C, B, A))

Hmmm. Funny you say that, because to me that is a major disadvantage of
the ChainMap form: you have to write the arguments in reverse order.

To him, it's obvious that + should be in the order of ChainMap, and he can't even conceive of the possibility that you'd want it "backward".
To you, it's obvious that + should be the other way around, and you find it annoying that ChainMap is "backward".
Which seems to imply that any attempt at setting an order is going to not only seem backward, but possibly surprisingly so, to a subset of Python's users.
And this is the kind of thing that can lead to subtle bugs. If a and b _almost never_ have duplicate keys, but very rarely do, you won't catch the problem until you think to test for it. And if one order or the other is so obvious to you that you didn't even imagine anyone would ever implement the opposite order, you probably won't think to write the test until you have a bug in the field…

I think this is a nice way of explaining the concern.

I'll also note that, given we turned a whole pile of similarly subtle
data driven bugs into structural type errors in the Python 3
transition, I'm not exactly enamoured of the idea of adding more :)

Cheers,
Nick.

--
Nick Coghlan | ***@gmail.com | Brisbane, Australia

Nick Coghlan

2014-07-28 22:27:06 UTC

The first part of this is one of the use cases for functools.partial(), so it
isn't a compelling argument for easy dict merging. The above is largely an
awkward way of spelling:

import functools
f = functools.partial(g, **...)

The one difference is to also silently *override* some of the explicitly
passed arguments, but that part's downright user hostile and shouldn't be
encouraged.
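
A minimal sketch of the functools.partial() spelling, assuming g accepts arbitrary keyword arguments. The function g and the keyword names are placeholders.

```python
import functools


def g(**kwargs):
    return kwargs


# Pre-bind default keyword arguments, playing the role of args0.
f = functools.partial(g, color="red", size=1)

# Keywords given at call time take precedence over the pre-bound ones.
result = f(size=2, shape="round")
print(result)   # {'color': 'red', 'size': 2, 'shape': 'round'}
```

This covers the "defaults" half of the pattern. What partial() does not offer is the forced override that args1 provides, which is the part described above as user hostile.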

Regards,
Nick.

Alexander Heger

2014-07-28 23:18:37 UTC

Post by Nick Coghlan

Post by Alexander Heger
args0 = dict(...)
args1 = dict(...)
g(**(args0 | kwargs | args1))
currently I have to write
temp_args = dict(args0)
temp_args.update(kwargs)
temp_args.update(args1)
g(**temp_args)

The first part of this is one of the use cases for functools.partial(), so it
isn't a compelling argument for easy dict merging. The above is largely an
awkward way of spelling:
import functools
f = functools.partial(g, **...)
The one difference is to also silently *override* some of the explicitly
passed arguments, but that part's downright user hostile and shouldn't be
encouraged.

yes, poor example due to brevity. ;-)

In my case f would actually do something with the values of kwargs
before calling g, and args1 may not be static outside f.
(hence partial is not a solution for the full application)

def f(**kwargs):
    # do something with kwargs, create dict0 and dict1 using kwargs
    temp_args = dict(dict0)
    temp_args.update(kwargs)
    temp_args.update(dict1)
    g(**temp_args)
    # more uses of dict0

which could be

def f(**kwargs):
    # do something with kwargs, create dict0 and dict1 using kwargs
    g(**collections.ChainMap(dict1, kwargs, dict0))
    # more uses of dict0

Maybe good enough for that case; as with + or |, one still needs to
know/learn the lookup order for key replacement, and it is sort of
bulky.

-Alexander

Alexander Heger

2014-07-28 22:20:53 UTC

Post by Nathan Schneider
https://mail.python.org/pipermail/python-ideas/2013-June/021140.html.

I see, this is a very extended thread google did not show me when I
started this one, and many good points were made there.
So, my apologies I restarted this w/o reference; this discussion does
seem to resurface, however.

It seems it would be valuable to parallel the behaviour of operators
already in place for collections. Counter:

A + B adds values (calls __add__ or __iadd__ function of values,
likely __iadd__ for values of A)
A |= B does A.update(B)
etc.

-Alexander


Stephen J. Turnbull

2014-07-29 00:16:08 UTC

Post by Alexander Heger
It seems it would be valuable to parallel the behaviour of operators
already in place for collections.

Mappings aren't collections. In set theory, of course, they are
represented as *appropriately restricted* collections, but the meaning
of "+" as applied to mappings in mathematics varies. For functions on
the same domain, there's usually an element-wise meaning that's
applied. For functions on different domains, I've seen it used to
mean "apply the appropriate function on the disjoint union of the
domains".

I don't think there's an obvious winner in the competition among the
various meanings.

Alexander Heger

2014-07-29 00:38:38 UTC

Post by Alexander Heger
It seems it would be valuable to parallel the behaviour of operators
already in place for collections.

Mappings aren't collections. In set theory, of course, they are
represented as *appropriately restricted* collections, but the meaning
of "+" as applied to mappings in mathematics varies. For functions on
the same domain, there's usually an element-wise meaning that's
applied. For functions on different domains, I've seen it used to
mean "apply the appropriate function on the disjoint union of the
domains".
I don't think there's an obvious winner in the competition among the
various meanings.

I mistyped. It should have read " ... the behaviour in place for
collections.Counter"

It does define "+" and "|" operations.
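
As a quick illustration of what collections.Counter defines today, here is a small sketch that reuses the A and B values from earlier in the thread. The names added, united and C are just labels for this example.

```python
from collections import Counter

A = Counter({'a': 1, 'b': 1})
B = Counter({'a': 2, 'c': 2})

added = A + B     # counts are summed per key
united = A | B    # the larger count is kept per key

C = Counter(A)
C.update(B)       # Counter.update also sums counts, unlike dict.update
```

Note that for Counter neither "|" nor update() replaces values the way dict.update() does.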

-Alexander

Stephen J. Turnbull

2014-07-29 03:13:15 UTC

Post by Alexander Heger
I mistyped. It should have read " ... the behaviour in place for
collections.Counter"

But there *is* a *the* (ie, unique) "additive" behavior for Counter.
(At least, I find it reasonable to think so.) What you're missing is
that there is no such agreement on what it means to add dictionaries.

True, you can "just pick one". Python doesn't much like to do that,
though. The problem is that on discovering that dictionaries can be
added, *everybody* is going to think that their personal application
is the obvious one to implement as "+" and/or "+=". Some of them are
going to be wrong and write buggy code as a consequence.
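
To make the disagreement concrete, here is a sketch of three of the candidate meanings raised in this thread, written with the existing dict API. The names summed, b_wins and a_wins are only labels for illustration.

```python
A = {'a': 1, 'b': 1}
B = {'a': 2, 'c': 2}

# 1. Add values, treating missing keys as 0.
summed = {k: A.get(k, 0) + B.get(k, 0) for k in A.keys() | B.keys()}

# 2. Shallow copy of A, updated with B (B wins on conflicts).
b_wins = dict(A)
b_wins.update(B)

# 3. Shallow copy of A, taking from B only the keys A lacks (A wins).
a_wins = dict(B)
a_wins.update(A)
```

All three give a different value for key 'a' (3, 2 and 1 respectively), so code written with one meaning in mind would silently misbehave under another.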

Nick Coghlan

2014-07-29 07:46:56 UTC

Post by Alexander Heger
I mistyped. It should have read " ... the behaviour in place for
collections.Counter"

But there *is* a *the* (ie, unique) "additive" behavior for Counter.
(At least, I find it reasonable to think so.) What you're missing is
that there is no such agreement on what it means to add dictionaries.
True, you can "just pick one". Python doesn't much like to do that,
though. The problem is that on discovering that dictionaries can be
added, *everybody* is going to think that their personal application
is the obvious one to implement as "+" and/or "+=". Some of them are
going to be wrong and write buggy code as a consequence.

In fact, the existence of collections.Counter.__add__ is an argument
against giving dict a "+" with different semantics:

>>> issubclass(collections.Counter, dict)
True

So, if someone *wants* a dict with "addable" semantics, they can
already use collections.Counter. While some of its methods really only
work with integers, the addition part is actually usable with
arbitrary addable types.
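
A small sketch of that point, using fractions.Fraction values instead of integer counts. The data is invented for illustration.

```python
from collections import Counter
from fractions import Fraction

# Values are Fractions rather than integer counts.
A = Counter({'a': Fraction(1, 2), 'b': Fraction(1, 3)})
B = Counter({'a': Fraction(1, 4), 'c': Fraction(2, 3)})

# Element-wise addition; keys missing from one side count as 0.
total = A + B
```

Two caveats apply: Counter addition drops any result that is not greater than zero, and a missing key is treated as the integer 0, so the value type has to support both adding to 0 and comparing with 0.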

If set-like semantics were added to dict, it would conflict with the
existing element-wise semantics of Counter.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan-***@public.gmane.org | Brisbane, Australia

Terry Reedy

2014-07-29 01:39:28 UTC

Post by Alexander Heger
It seems it would be valuable to parallel the behaviour of operators
already in place for collections.

This assumes the same range set (of addable items) also. If Python were
to add d1 + d2 and d1 += d2, I think we should use this existing and
most common definition and add values. The use cases are keyed
collections of things that can be added, which are pretty common.
Then dict addition would have the properties of the value addition.

Example: Let sales be a mapping from salesperson to total sales (since
whenever). Let sales_today be a mapping from salesperson to today's
sales. Then sales = sales + sales_today, or sales += sales_today. I
could, of course, do this today with class Sales(dict): with __add__,
__iadd__, and probably other app-specific methods.
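
A minimal sketch of such a class, assuming numeric values and treating a salesperson missing from one side as having a total of 0. The names and figures are hypothetical, and this is one possible spelling rather than a recommended design.

```python
class Sales(dict):
    """Hypothetical dict subclass whose + and += add values key by key."""

    def __iadd__(self, other):
        # Combine in place: add to existing totals, start missing ones at 0.
        for key, value in other.items():
            self[key] = self.get(key, 0) + value
        return self

    def __add__(self, other):
        # Work on a shallow copy so neither operand is modified.
        result = Sales(self)
        result += other
        return result


sales = Sales({'ann': 100, 'bob': 250})
sales_today = {'bob': 50, 'cat': 75}

new_total = sales + sales_today   # sales is left unchanged here
sales += sales_today              # sales is updated in place
```

The right-hand operand can be any mapping with an items() method, so sales_today does not itself need to be a Sales instance.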

The issue is that there are two ways to update a mapping with an update
mapping: replace values and combine values. Addition combines, so to me,
dict addition, if defined, should combine.

Post by Stephen J. Turnbull
For functions on different domains, I've seen it used to
mean "apply the appropriate function on the disjoint union of the
domains".

According to https://en.wikipedia.org/wiki/Disjoint_union, d_u has at
least two meanings.

--
Terry Jan Reedy

Stephen J. Turnbull

2014-07-29 05:15:44 UTC

Post by Terry Reedy
This assumes the same range set (of addable items) also. If Python were
to add d1 + d2 and d1 += d2, I think we should use this existing and
most common definition and add values.

IMHO[1] that's way too special for the generic mapping types. If one
wants such operations, she should define NumericValuedMapping and
StringValuedMapping etc classes for each additive set of values.

Post by Terry Reedy

Post by Stephen J. Turnbull
For functions on different domains, I've seen it used to
mean "apply the appropriate function on the disjoint union of the
domains".

According to https://en.wikipedia.org/wiki/Disjoint_union, d_u has at
least two meanings.

Either meaning will do here, with the distinction that the set-
theoretic meaning (which I intended) applies to any two functions,
while the alternate meaning imposes a restriction on the functions
that can be added (and therefore is inappropriate for this discussion
IMHO).

Footnotes:
[1] I mean the "H", I'm no authority.

Devin Jeanpierre

2014-07-29 02:46:14 UTC
