python string += strategies

 

So many posts have been written about string concatenation in Python.
I needed to look into this issue myself a few days ago and decided to measure it.

I wrote a deterministic timer, and ran the following tests:

  • concatenate a string with +=
  • concatenate a bytearray with extend
  • concatenate a list of strings with str.join
  • concatenate strings with cStringIO
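
For readers who want to try this at home, here is a minimal sketch of such a comparison. It is not my profile.py, and the timings further down do not come from it. All function names and the `run` helper are made up for illustration, `timeit` stands in for the timer, and `io.StringIO` stands in for cStringIO (which only exists on Python 2).

```python
import io
import timeit


def concat_plus_equals(n):
    # Build the string with repeated += on an immutable str.
    s = ""
    for _ in range(n):
        s += "x"
    return s


def concat_bytearray(n):
    # Extend a mutable bytearray, then decode once at the end.
    buf = bytearray()
    for _ in range(n):
        buf.extend(b"x")
    return buf.decode("ascii")


def concat_join(n):
    # Collect the pieces in a list and join them once.
    parts = []
    for _ in range(n):
        parts.append("x")
    return "".join(parts)


def concat_stringio(n):
    # Write into an in-memory text buffer (assumed stand-in for cStringIO).
    buf = io.StringIO()
    for _ in range(n):
        buf.write("x")
    return buf.getvalue()


STRATEGIES = {
    "+=": concat_plus_equals,
    "bytearray": concat_bytearray,
    "str.join": concat_join,
    "StringIO": concat_stringio,
}


def run(n=100000):
    # Time a single run of each strategy; returns {name: seconds}.
    return {
        name: timeit.timeit(lambda: fn(n), number=1)
        for name, fn in STRATEGIES.items()
    }
```

Calling `run(1000000)` and printing the dictionary gives one timing per strategy. Expect the numbers to vary between machines and Python versions.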

Here are the results for 1000000 iterations:

$ ITERATIONS=1000000 python profile.py 1> /dev/null
Profiling 1000000 iterations [python 3.5.2]

block took 2.083 seconds to run [cstringio]
block took 2.031 seconds to run [str.join]
block took 1.705 seconds to run [+=]
block took 1.835 seconds to run [bytearray]

$ ITERATIONS=1000000 python profile.py 1> /dev/null
Profiling 1000000 iterations [python 2.7.12]

block took 1.715 seconds to run [cstringio]
block took 1.289 seconds to run [+=]
block took 1.623 seconds to run [str.join]
block took 1.332 seconds to run [bytearray]

They are quite surprising. The speed of the bytearray operation was expected (bytearray is mutable), but I was expecting cstringio and str.join to outperform += on an immutable string, especially after reading this benchmark.

What’s going on here?


Password for every occasion

Every geek talks about how strong his/her passwords are, how he/she read this and that article about strong password generation, and why strong passwords are important.

I’ll tell you what the problem is, geek to geek: I believe in strong passwords, but I honestly don’t remember them!

Of course I can use password managers like KeePass, LastPass, SplashID, or any other, but I find this approach problematic:

  • THERE’S A FILE THAT STORES MY NON-HASHED PASSWORDS. That’s insane!
  • Furthermore, I need a way to access all these passwords on the go, so I would probably use a cloud provider, which means that THERE’S A CLOUD THAT STORES MY NON-HASHED PASSWORDS. That’s really insane!

I hope you get my point. The solution? Password Chameleon.

Password Chameleon handles this issue in a really simple and clever way:

  1. Remember one master password - never save it anywhere.
  2. Surf to some random website.
  3. Enter your master password and the domain name of the website.
  4. Password Chameleon will generate a strong password out of both.

This solution is awesome - one password that generates a different, unique password for each website I surf to.
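
To make the idea concrete, here is a minimal sketch of deriving a per-site password from a master password and a domain name. This is NOT Password Chameleon's actual algorithm, and its output will not match Password Chameleon's. The function name `derive_password`, the use of PBKDF2 with the domain as salt, the iteration count, and the 16-character length are all assumptions I picked for illustration.

```python
import base64
import hashlib


def derive_password(master, domain, length=16):
    # Hypothetical scheme: stretch the master password with PBKDF2-HMAC-SHA256,
    # using the lowercased domain as the salt, so each site gets its own result.
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        master.encode("utf-8"),
        domain.lower().encode("utf-8"),
        100000,  # iteration count, an assumption for this sketch
    )
    # Encode the raw bytes as URL-safe base64 text and keep the first `length` chars.
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]
```

The same master password and domain always give the same result, so nothing needs to be stored, while a different domain gives an unrelated password.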

Now that I got you on board, I think you should subscribe to Have I Been Pwned?, and if you believe in it like I do, please donate.

By the way, if you want a command line version of Password Chameleon, I’ve taken the liberty to port it to Python.

exclude grep on ps a | grep

Have you ever tried to grep for a process and seen grep show up too? Annoying, right?

$ ps aux | grep "python"
8714 pts/2 S+ 0:00 python
8716 pts/1 S+ 0:00 grep -color python

Well, turns out Wayne Werner found a cool solution!

$ ps aux | grep "[p]ython"
8714 pts/2 S+ 0:00 python

How does it work? By putting brackets around the letter and quotes around the string, you search for a regex that says “find the character ‘p’ followed by ‘ython’”.

But since you put the brackets in the pattern, ‘p’ is now followed by ‘]’ in grep’s own command line, so grep won’t show up in the results list. Why? Because its text is “grep -color [p]ython” and not “grep -color python”.
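
The same behaviour can be reproduced in Python with the `re` module. This is only a sketch: the `simulate` helper and the two fake process lines are made up, and it assumes grep's basic regex treats `[p]` the same way Python's `re` does (it does for this simple bracket expression).

```python
import re


def simulate(pattern):
    # Pretend process list: a real python process plus the grep command itself,
    # whose command line contains the pattern exactly as it was typed.
    lines = [
        "8714 pts/2 S+ 0:00 python",
        "8716 pts/1 S+ 0:00 grep -color " + pattern,
    ]
    # Keep only the lines the regex matches, like grep would.
    return [line for line in lines if re.search(pattern, line)]


plain = simulate("python")        # matches both lines
bracketed = simulate("[p]ython")  # matches only the python process
```

With the plain pattern the grep line matches itself. With the bracketed pattern the grep line contains the literal text `[p]ython`, where ‘p’ is followed by ‘]’, so the regex no longer matches it.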

forgot to sudo? use 'fuck'

We all forget to sudo when we need to.

But how many of you have fuck at your disposal?

How

bash

alias fuck='sudo $(history -p \!\!)'

zsh

alias fuck='sudo $(fc -ln -1)'

fish

alias fuck='eval sudo $history[1]'

thefuck

A command line tool that corrects the previous console command, inspired by @liamosaur's tweet.
It does much more than adding sudo.

Installation is as easy as:

$ pip install thefuck

And like most good things nowadays, it’s hosted on GitHub: nvbn/thefuck

overriding builtins

I fired up Python and wrote the following code:

sum(range(5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not callable

Why did this happen? Well, I had accidentally overwritten the sum builtin:

sum = 0

The fix is quite trivial: reload the builtin :) (this is the Python 2 spelling; on Python 3 the module is called builtins)

from __builtin__ import sum  
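
Here is a small sketch of the whole round trip on Python 3, where the module is named `builtins` instead of `__builtin__`. The variable names `message` and `total` are just for illustration.

```python
import builtins

sum = 0  # accidentally shadow the builtin with an int

try:
    sum(range(5))
except TypeError as exc:
    message = str(exc)  # the same error as in the traceback above

sum = builtins.sum  # rebind the name to the real builtin
total = sum(range(5))
```

At the interactive prompt, `del sum` should work as well: it removes the shadowing name, so the lookup falls through to the builtin again.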

get first item in iterable

Honestly, this is definitely not the best idiom around, but I still find it useful.

If you ever find yourself writing the following code:

candidate = [x for x in iterable if "text" in x]
if len(candidate) == 1:
    text = candidate[0]

you can use the following method to make the code a bit cleaner:

def get_first(iterable, predicate=None, default=None):
    if not iterable:
        return default

    for item in iterable:
        if not predicate:
            return item
        if predicate(item):
            return item
    return default

text = get_first(iterable, lambda x: "text" in x)
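
A shorter variant, sketched here as an alternative rather than a replacement, leans on the builtin `next()` with a default value. It assumes the same signature as the function above and stops at the first match without building a list.

```python
def get_first(iterable, predicate=None, default=None):
    if predicate is None:
        # No predicate: just take the first item, or the default if empty.
        return next(iter(iterable), default)
    # Lazily filter and take the first item that satisfies the predicate.
    return next((item for item in iterable if predicate(item)), default)


lines = ["no match", "some text here", "more text"]
text = get_first(lines, lambda x: "text" in x)
```

Unlike the list comprehension version, this one returns the first match even when several items match.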

Hettinger's iterator-with-sentinel

I just came across a beautiful technique to turn a function call into an iterator - Hettinger’s iterator-with-sentinel idiom.

Most developers will probably write the following lines of code (myself included):

with open("file.data") as f:
    while True:
        char = f.read(1)
        if not char:
            break
        else:
            # ...

A different, more elegant approach which uses Hettinger’s iterator-with-sentinel idiom:

from functools import partial

with open("file.data") as f:
    for char in iter(partial(f.read, 1), ""):
        # ...

  • The with-statement opens the file and unconditionally closes it when you’re finished.
  • The usual way to read one character is f.read(1).
  • The partial creates a function of zero arguments by always calling f.read with an argument of 1.
  • The two-argument form of iter() creates an iterator that loops until you see the empty-string end-of-file marker.

Check out this answer on Stack Overflow that uses this idiom to decompress a gzip stream.
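
Here is a self-contained sketch of the idiom that runs without a file on disk. It uses `io.StringIO` as a stand-in for the opened file, and the `read_chars` helper is a name I made up for illustration.

```python
import io
from functools import partial


def read_chars(stream):
    # iter(callable, sentinel) keeps calling stream.read(1)
    # until it returns the sentinel "" (end of stream).
    return list(iter(partial(stream.read, 1), ""))


chars = read_chars(io.StringIO("abc"))
```

For a file opened in binary mode, `read` returns bytes, so the sentinel has to be `b""` instead of `""`, otherwise the loop never ends.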