Masklinn
2012-09-13 13:15:03 UTC
attrgetter and itemgetter are both very useful functions, but both have
a significant pitfall if the arguments passed in are validated but not
controlled: if receiving the arguments (list of attributes, keys or
indexes) from an external source and *-applying it, if the external
source passes a sequence of one element both functions will in turn
return an element rather than a singleton (1-element tuple).
This means such code, for instance code "slicing" a matrix of some sort
to get only some columns and getting the slicing information from its
caller (in situation where extracting a single column may be perfectly
sensible) will have to implement a manual dispatch between a "manual"
getitem (or getattr) and an itemgetter (resp. attrgetter) call, e.g.
slicer = (operator.itemgetter(*indices) if len(indices) > 1
else lambda ar: [ar[indices[0]])
This makes for more verbose and less straightforward code, I think it
would be useful to such situations if attrgetter and itemgetter could be
forced into always returning a tuple by way of an optional argument:
# works the same no matter what len(indices) is
slicer = operator.itemgetter(*indices, force_tuple=True)
which in the example equivalences[0] would be an override (to False) of
the `len` check (`len(items) == 1` would become `len(items) == 1 and not
force_tuple`)
The argument is backward-compatible as neither function currently
accepts any keyword argument.
Uncertainty note: whether force_tuple (or whatever its name is)
silences the error generated when len(indices) == 0, and returns
a null tuple rather than raising a TypeError.
[0] http://docs.python.org/dev/library/operator.html#operator.attrgetter
a significant pitfall if the arguments passed in are validated but not
controlled: if receiving the arguments (list of attributes, keys or
indexes) from an external source and *-applying it, if the external
source passes a sequence of one element both functions will in turn
return an element rather than a singleton (1-element tuple).
This means such code, for instance code "slicing" a matrix of some sort
to get only some columns and getting the slicing information from its
caller (in situation where extracting a single column may be perfectly
sensible) will have to implement a manual dispatch between a "manual"
getitem (or getattr) and an itemgetter (resp. attrgetter) call, e.g.
slicer = (operator.itemgetter(*indices) if len(indices) > 1
else lambda ar: [ar[indices[0]])
This makes for more verbose and less straightforward code, I think it
would be useful to such situations if attrgetter and itemgetter could be
forced into always returning a tuple by way of an optional argument:
# works the same no matter what len(indices) is
slicer = operator.itemgetter(*indices, force_tuple=True)
which in the example equivalences[0] would be an override (to False) of
the `len` check (`len(items) == 1` would become `len(items) == 1 and not
force_tuple`)
The argument is backward-compatible as neither function currently
accepts any keyword argument.
Uncertainty note: whether force_tuple (or whatever its name is)
silences the error generated when len(indices) == 0, and returns
a null tuple rather than raising a TypeError.
[0] http://docs.python.org/dev/library/operator.html#operator.attrgetter