One of the neat features missing from Python < 3.6 is statement evaluation in strings. In this blog post we’ll use eval, exec and compile to implement a print function with statement evaluation.
After we’re done, we’ll be able to write the following code:
    name = "oded"
    age = 26
    print("My name is: {name}, my age is: {age} and 2*3={2*3}!")
    # My name is: oded, my age is: 26 and 2*3=6!
TL;DR: The new print function is 1.33x slower than the regular print when evaluating statements, and 1.09x slower for regular print operations. The whole snippet can be found here.
A WORD OF WARNING: THIS IS NOT PRODUCTION READY CODE:
* Parsing statements and accessing parent functions is a dangerous practice.
* The caching key is not deterministic and might return invalid results if multiple identical format strings exist in the same scope.
First of all, since Python 3, print is a function, which means it can be replaced! This behavior has been backported to Python 2.7 through the __future__ module (from __future__ import print_function).
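To make "it can be replaced" concrete, here is a minimal sketch that swaps print for a wrapper and then restores it. It assumes Python 3 (the standard `builtins` module rather than six), and `shouting_print` is a hypothetical name used only for illustration.

```python
import builtins
import io

# Keep a reference to the original so the replacement can delegate to it.
original_print = builtins.print

def shouting_print(*args, **kwargs):
    """Hypothetical replacement: upper-cases every argument before printing."""
    original_print(*(str(arg).upper() for arg in args), **kwargs)

builtins.print = shouting_print
try:
    buffer = io.StringIO()
    print("hello", file=buffer)  # the name print now resolves to shouting_print
    captured = buffer.getvalue()
finally:
    builtins.print = original_print  # always put the real print back
```

Because the name is looked up in the builtins namespace at call time, every module that has not defined its own print sees the replacement, which is also why restoring it afterwards matters.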
Second, we need to check the performance of our solution compared to the standard print function. The following gist takes **0.113 seconds** to run:
    from time import time
    import os

    devnull = open(os.devnull, 'w')

    name = "oded"
    t = time()
    for _ in range(50000):
        print("my name is {} & 2*3={}".format(name, 2*3), file=devnull)
    print("block took {0:.3f} seconds".format(time() - t))
Now that we have a number to compare to, we can start writing code.
eval
eval takes a string and evaluates it as an expression. We can pass it context in the form of globals and locals. The code will extract the keywords from the text, then evaluate them using eval:
    from __future__ import print_function
    import os
    import six
    from time import time
    from inspect import currentframe
    from string import Formatter

    string_formatter = Formatter()

    devnull = open(os.devnull, 'w')

    # keep the original print reachable from anywhere under the name std_print
    six.moves.builtins.std_print = print

    def print(text, **kwargs):
        # the frame of whoever called print, so we can see its variables
        caller = currentframe().f_back
        keywords = {kw for _, kw, _, _ in string_formatter.parse(text) if kw}

        for keyword in keywords:
            value = eval(keyword, caller.f_globals, caller.f_locals)
            text = text.replace("{{{}}}".format(keyword), str(value))

        std_print(text, **kwargs)

    name = "oded"
    t = time()
    for _ in range(50000):
        print("my name is {name} & 2*3={2*3}", file=devnull)
    std_print("block took {0:.3f} seconds".format(time() - t))
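The two building blocks of that function, `Formatter.parse` and `eval`, can be tried in isolation. The sketch below is only an illustration: the dictionaries `scope_globals` and `scope_locals` are stand-ins for the caller's frame, not part of the original gist.

```python
from string import Formatter

text = "my name is {name} & 2*3={2*3}"

# parse yields (literal_text, field_name, format_spec, conversion) tuples
fields = list(Formatter().parse(text))
keywords = [field_name for _, field_name, _, _ in fields if field_name]

# Stand-ins for caller.f_globals and caller.f_locals
scope_globals = {"greeting": "hello"}
scope_locals = {"name": "oded", "age": 26}

# eval looks names up in the locals mapping first, then in the globals
values = {kw: eval(kw, scope_globals, scope_locals) for kw in keywords}
mixed = eval("greeting + ' ' + name", scope_globals, scope_locals)
```

Note that `Formatter.parse` treats `:` and `!` as the start of a format spec or conversion, so a field such as `{value:>5}` comes back with the field name `value` only.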
The eval gist took 1.612 seconds to run, which is 14.26x slower than regular print! That’s pretty expensive!
Can we do better?
exec
exec takes a block of code and executes it. It too can be given a context to run in. In many cases, printing a line is a repeated task, which means that we can cache most of the calculations we do and generate the code only once!
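As a small illustration of exec running a block of statements inside a supplied context, the sketch below builds a few assignment lines and executes them against a plain dictionary. The `__keywords` name mirrors the one used later in this post; the `namespace` dictionary is an assumption standing in for a real caller frame.

```python
# A block of statements, built as text
source = "\n".join([
    "__keywords = {}",
    "__keywords['greeting'] = 'hello ' + name",
    "__keywords['product'] = 2 * 3",
])

# The context the code runs in; exec reads and writes this dictionary
namespace = {"name": "oded"}
exec(source, namespace)

# Anything the block assigned is now available in the namespace
results = namespace["__keywords"]
```

Unlike eval, exec returns nothing, so results have to be handed back through the namespace it ran in.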
We’re going to dynamically create the code we want to evaluate, then feed it to exec.
We’ll also create an object that’ll hold the parsed text and code, then pull it from a cache on every print call:
    from __future__ import print_function
    import os
    import sys
    import six
    from time import time
    from random import choice
    from string import Formatter
    from string import ascii_letters
    from inspect import currentframe

    string_formatter = Formatter()

    six.moves.builtins.std_print = print

    def random_string(length=20):
        """ generate a random string of given length """
        return "".join(choice(ascii_letters) for _ in range(length))

    class TextObject(object):
        def __init__(self, text, code):
            self.text = text
            self.code = code

        @classmethod
        def from_text(cls, text):
            # map a generated, format-safe name to each statement found in the text
            keywords = {random_string(): kw for _, kw, _, _ in string_formatter.parse(text) if kw}

            code = ["__keywords={}"]
            line_code = "__keywords[\"{kw_name}\"] = {kw_statement}"
            for key, value in six.iteritems(keywords):
                code.append(line_code.format(kw_name=key, kw_statement=value))
                text = text.replace("{{{}}}".format(value), "{{{}}}".format(key))

            # ...

    # ...

    name = "oded"
    t = time()
    for _ in range(50000):
        print("my name is {name} & 2*3={2*3}", file=devnull)
    std_print("block took {0:.3f} seconds".format(time() - t))
The exec gist took 1.880 seconds, which is actually worse than eval! It’s 16.63x slower than regular print.
That means that not only did we optimize the wrong thing, we also added dictionary overhead to the print call. exec and eval are extremely expensive because they parse the code string every time. Can we remove that overhead?
compile
Instead of parsing the code over and over again, we’ll parse it once and store the resulting code object. The following code is similar to the previous one, except for the part that compiles the code string using compile:
    from __future__ import print_function
    import os
    import sys
    import six
    from time import time
    from random import choice
    from string import Formatter
    from string import ascii_letters
    from inspect import currentframe

    string_formatter = Formatter()

    six.moves.builtins.std_print = print

    def random_string(length=20):
        """ generate a random string of given length """
        return "".join(choice(ascii_letters) for _ in range(length))

    class TextObject(object):
        def __init__(self, text, code):
            self.text = text
            self.code = code

        @classmethod
        def from_text(cls, text):
            # map a generated, format-safe name to each statement found in the text
            keywords = {random_string(): kw for _, kw, _, _ in string_formatter.parse(text) if kw}

            code = ["__keywords={}"]
            line_code = "__keywords[\"{kw_name}\"] = {kw_statement}"
            for key, value in six.iteritems(keywords):
                code.append(line_code.format(kw_name=key, kw_statement=value))
                text = text.replace("{{{}}}".format(value), "{{{}}}".format(key))

            # ...

    # ...

    name = "oded"
    t = time()
    for _ in range(50000):
        print("my name is {name} & 2*3={2*3}", file=devnull)
    std_print("block took {0:.3f} seconds".format(time() - t))
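The core idea, compiling once and reusing the code object, can be shown without the rest of the machinery. This is a minimal sketch with made-up names (`price`, `quantity`, the `"<format-string>"` filename label); it is not taken from the gist.

```python
# Parse and compile the expression a single time
code_object = compile("price * quantity", "<format-string>", "eval")

# Reuse the same code object against different namespaces;
# eval accepts a code object as well as a string
orders = [(2, 3), (10, 4)]
totals = [
    eval(code_object, {"price": price, "quantity": quantity})
    for price, quantity in orders
]
```

The third argument to compile selects the mode: `"eval"` for a single expression, `"exec"` for a block of statements such as the generated `__keywords` assignments.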
compile took 0.192 seconds to run, or 1.70x slower than regular print. That’s a major improvement compared to our previous best result!
Can we do better?
optimizing compiled code
Python has a neat locals optimization called “fast locals”, which makes local variable access faster than a regular dictionary lookup. When in local scope, Python keeps track of the names of the variables it knows about. Each of those variables is assigned an index, and that index is used to look the value up in an array of Python objects instead of a dictionary. More about that: here.
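The difference is visible in the bytecode. The following sketch, assuming CPython and its standard `dis` module, compares the load instructions of a function body with those of code compiled outside a function. Exact opcode names vary between CPython versions, so it only checks for the `LOAD_FAST` family by prefix.

```python
import dis

def add(x, y):
    # x and y are function locals, so they get array slots
    return x + y

# The same expression compiled outside of any function
loose_code = compile("x + y", "<demo>", "eval")

function_ops = {ins.opname for ins in dis.get_instructions(add)}
loose_ops = {ins.opname for ins in dis.get_instructions(loose_code)}

# Inside a function: index-based access (LOAD_FAST and its variants)
uses_fast_locals = any(op.startswith("LOAD_FAST") for op in function_ops)
# Outside a function: name-based dictionary lookups
uses_name_lookup = "LOAD_NAME" in loose_ops
```

This is why wrapping the generated statements in a `def` can pay off: names assigned inside the function body become indexed slots instead of dictionary entries.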
We’ll rewrite our TextObject.to_text function to generate code that will hint python to use fast locals:
The generated code will look similar to:
    def awesome_print(text, **kwargs):
        text = "a format with {GENERATED_NAME} and {ANOTHER_GENERATED_NAME}"
        # kwargs (file=..., end=...) belong to print, not to str.format
        std_print(text.format(GENERATED_NAME=(STATEMENT),
                              ANOTHER_GENERATED_NAME=(ANOTHER_STATEMENT)),
                  **kwargs)
We’ll now cache the generated function instead of the code object.
    import os
    import six
    from time import time
    from random import choice
    from string import Formatter
    from string import ascii_letters
    from inspect import currentframe

    string_formatter = Formatter()

    six.moves.builtins.std_print = print

    def random_string(length=20):
        """ generate a random string of given length """
        return "".join(choice(ascii_letters) for _ in range(length))

    class TextObject(object):
        # ...

            # generated code must be indented so it sits inside the def
            code = ["def awesome_print(text, **kwargs):"]
            line_code = "    {kw_name} = {kw_statement}"
            for key, value in six.iteritems(keywords):
                text = text.replace("{{{}}}".format(value), "{{{}}}".format(key))

            code.append("    text = \"{}\"".format(text))
            args = ", ".join("{0}=({1})".format(key, val) for key, val in six.iteritems(keywords))

            # ...

        @classmethod
        def compile(cls, caller, text):
            # optimize for the trivial case where no formatting is done
            if not cls.get_keywords(text):
                return cls(code=std_print, caller=caller)
            obj = cls.from_text(caller, text)
            # running the generated def creates awesome_print in the given namespace
            exec(obj.code, caller.f_locals, caller.f_locals)
            return cls(code=caller.f_locals.pop("awesome_print"), caller=caller)

    # ...

    name = "oded"
    t = time()
    for _ in range(50000):
        print("my name is {name} & 2*3={2*3}", file=devnull)
    std_print("block took {0:.3f} seconds".format(time() - t))
This time it took **0.150 seconds** to run, or **1.33x slower** than regular print. You probably noticed we pass the locals twice instead of the globals. Any idea why?
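One detail that may help with that question, offered as a hint rather than as the author's answer: a function created by exec uses the dictionary passed as globals for every name that is not local to it. The sketch below demonstrates this with hypothetical names (`awesome_print_demo`, `caller`) and a copy of the locals instead of a live frame.

```python
function_source = "\n".join([
    "def awesome_print_demo():",
    "    return 'my name is {}'.format(name)",
])

def caller():
    name = "oded"
    # Snapshot of this function's locals, standing in for caller.f_locals
    namespace = dict(locals())
    # The same dictionary is passed as both globals and locals
    exec(function_source, namespace, namespace)
    generated = namespace.pop("awesome_print_demo")
    return generated, namespace

generated, namespace = caller()

# name is not local to awesome_print_demo, so it is resolved
# through the function's globals, which is the dictionary given to exec
message = generated()
```

Had the real module globals been passed instead, the generated function would not see variables that exist only in the calling function.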
Optimizations are highly appreciated. My next step would probably be byte-code rewriting. I want to replace specific print calls with calls to the compiled print object, so we won’t need a cache at all. Stay tuned!