What I Learned From Rolling Out A Career Leveling Framework

I recently rolled out a career leveling framework at Upwave, the company where I’m currently the Director of Engineering. It was my first time creating or releasing this type of framework and it was definitely an informative experience.

I want to share some specific tactical lessons I learned from the process. The overall goals and design of the framework itself are a large, deep, only semi-related topic so for brevity I’ll mostly skip that and focus just on the lessons I learned from the roll out itself.

If you’re interested in seeing our final framework template, you can find it here: Upwave 2021 Career Growth Framework. And if you’re interested in diving into the design of the framework itself then let me know and maybe I’ll write something more in-depth.


The rollout took about a month, but the conception and creation of the framework took nearly two years. I started thinking about career leveling about a year after I first started managing Upwave’s data team (probably right after completing a set of “semi-annual reviews”, which at the time we did ad hoc and on individual team members’ semi-anniversaries).

Inspired by frameworks like Rent the Runway’s and CircleCI’s, I put together a draft leveling matrix. You’ve probably seen the type, with career levels on the columns and attributes like “reliability” or “collaboration” on the rows.

I trial-ran this early draft matrix on one of my direct reports at the time and it didn’t go well. I presented my line-by-line ratings to him and he objected vigorously to several of them. Overall, the conversation ended feeling more contentious than productive. At the time, we lacked any clear pressing need to have a leveling framework at all and so I chose to put the framework on the shelf rather than revise it or barrel ahead.

Time passed and I eventually became Upwave’s Director of Engineering, and thus responsible for our whole engineering team. Our team spans the full spectrum of experience levels (from ~2 years to ~20+ years) and the full set of software subspecialties (web, devops, and data engineering) of a typical SaaS startup.

While I have always given constructive feedback to all of our engineers on a regular basis, as the team has grown it has become increasingly challenging to make sure that the feedback I’m giving fits into a cohesive narrative about how each engineer should be investing their efforts to advance their careers.

That’s exactly the type of problem that a career framework can help with, and so over the last ~4 months I dusted off my first draft and started using spare slices of time between meetings to refine it.

Come January, my CTO and I agreed that it was finally time to roll it out.

The Roll Out

We have a biweekly Product Development All-Hands, so I used that meeting to deliver a ~15 minute presentation announcing that the framework (and an initial level-setting cycle) was coming and explaining how it would work.

I then created a copy of the matrix spreadsheet (from a template in Google sheets) for each engineer and gave them ~2 weeks to fill out a first draft on the sheet (i.e. by giving themselves a rating on each attribute).

I then asked them to schedule a live 1-hour meeting with me to talk through their self-assessment and for me to share any adjustments I needed to make (along with my reasoning). The final set of ratings implied a final “score” that determined each engineer’s level.

I was able to complete 95% of these conversations within ~2 weeks, with one of them getting delayed another 2 weeks by scheduling conflicts. I finished that last one last Friday.

Overall, I’m quite pleased with how everything went. The conversations were by-and-large amicable and productive, with none of the engineers expressing substantial surprise or dismay at the outcomes (even in cases where I had to down-rate them on several attributes, which in some cases suggested a lower overall level than their self-assessment would have suggested).

It’s too early to draw meaningful conclusions about whether the framework will actually accomplish my primary goals of:

  1. ensuring that we recognize and compensate our engineers as fairly as possible, based on performance and not on characteristics irrelevant to the job.
  2. giving each engineer the clearest guidance we can about what specifically they’ll need to improve in order to achieve the next level of impact, recognition, and rewards.

But even in the several weeks since I completed the earliest of the conversations, I’ve already seen what looks to me like a marked improvement within areas that certain engineers and I together identified as primary opportunities for growth.

Lessons Learned:

1. Rolling out a leveling framework can come as a big, anxiety-provoking surprise

In retrospect this should have been obvious, but since leveling frameworks directly impact people’s career trajectories they’re understandably perceived as being high-stakes. They also carry a lot of baggage based on how every other job your engineers have had has handled evaluations, titles, promotions, etc.

As soon as you start talking about titles and levels, people’s minds will go back to their past bad experiences and the onus will be on you to assure them that you are not about to repeat all their past managers’ mistakes.

I made the mistake of announcing the leveling framework during a meeting that already had a packed agenda. There were quite a few questions following my presentation and I took the time to answer them, but that ended up forcing the next presenter to compress his material (an obvious faux pas) and also meant that the Q&A period felt rushed and didn’t afford us time for the quieter engineers to reflect and formulate questions.

I don’t think there’s any great way to announce this kind of thing without provoking any anxiety. But two things I would have done differently are:

  1. Make sure to present the system in a meeting with plenty of time for discussion (and to call a separate designated meeting if I couldn’t fit it neatly into a pre-scheduled one).
  2. Telegraph the system in an initial meeting, saying something like “I wanted to give you all a heads up that I’ve been thinking about a career leveling framework. It’s still in beta mode but I hope to be ready to share the details in 2 weeks. Please let me know in our next 1-1 if you have any thoughts you’d like to share, e.g. about how growth/leveling was handled well or poorly at past companies.” And then listen to input, adjust as necessary, and give a follow-up presentation sharing the details of the system when they’re ready.

2. Changing the Framework Post-Release is Hard

Writing a career framework from scratch is a lot of work. There are a lot of details that go into making a complete, usable framework. And it’s easy to spend a nearly unlimited amount of time endlessly tweaking e.g. the descriptions of what it means to be a “level 4 debugger”. Unfortunately, unlike software where you can (and usually should) ship an MVP and then iterate, leveling frameworks are hard to iterate on once they’ve been released.

There are some silly reasons why making changes is hard, e.g. that Google sheets has bugs that make it kind of a nightmare to maintain formatting consistency between copies of spreadsheets.

Then there are very good reasons why making changes is hard, e.g. that if you change the definition of a Level 4 debugger, someone who was previously a Level 4 debugger might now suddenly be a Level 2 debugger. And that person, if they had just barely made it to Level 4 overall, might now by your new definition come out as a Level 3 overall. Unfortunately for both of you, demotions are horrible (see next point) and so you’ll have to make a plan for what you’re going to do if your iterating negatively affects people’s levels. And even if you don’t actually change an implied level, you may have just wasted a lot of an engineer’s time by devaluing a skill the engineer may have just invested a great deal of effort improving.

While there’s no way you can get your framework exactly right the first time, you really should try to get it as right as possible so you can minimize the changes you need to make later.

3. Your Framework Mustn’t Demote People

Demotions are horrible. They’re personally humiliating to the person being demoted. And they force management into a lose-lose choice: either cut the demoted person’s pay (which could be economically ruinous to them, if they have e.g. a mortgage based on their prior income), or leave them obviously overpaid relative to peers at their new lower level.

Luckily for me, we had no formalized job titles before this rollout. So when I set Level 4 as “Senior Engineer” and made Level 4 a moderately late-career level, I didn’t have to go around and tell anyone that they needed to remove “Senior” from their job title.

But the very first question I got during my roll out presentation was “what will you do if you make a change to the definitions and it causes someone to lose a level?”

That was a very good question that I was not adequately prepared to answer. I was prepared to (and did) say “we’ve designed the framework to be parsimonious with ratings, specifically to minimize the odds that someone will get over-leveled.”

The part I was not prepared to say (but now believe is the only viable answer here) is “no one will be allowed to lose a level because of a change in the framework, even if the adjusted framework would imply that they should”. I prevaricated on this point because I hadn’t confirmed with my CTO that he’d be comfortable with that commitment.

But on reflection, I think that demotions are just so bad that you should never do them under virtually any circumstances. You especially shouldn’t do them because you the manager made a mistake. If you screwed up and over-promoted someone, the company should pay the literal and figurative price of supporting that person at the higher level until they (hopefully) grow into the higher set of expectations. On the plus side, if your framework is reasonably granular and you’re only making minor tweaks, then the only people who’d be in danger of slipping a level should be people who only just barely got promoted anyway. So it’s a reasonable bet that those people will organically cross the new line soon anyway and that their over-leveling will be short-lived.

You may be tempted to try to avoid this problem by making your framework more vague, thinking that will allow you to make mutually face-saving “fudges” to preserve people’s levels when you make a change. Don’t do this; vagueness in the standards undermines a large portion of the value of these frameworks (see next point).

4. Worry about Clarity, Not Complexity

My current version of our framework has ~33 attributes that apply to any given engineer (exactly which attributes apply depends on their specific engineering specialty). That sounds like a lot. Most other companies’ frameworks I used as models had 5-10 attributes.

An intermediate draft of my framework had ~40 attributes per engineer. In the final revision, I got cold feet, worried “this will confuse / overwhelm people” and cut/consolidated down to ~33 attributes, and worried that might still be way too many.

That was a mistake. I wish I’d stuck with more attributes.

Why? Because granularity allows clarity. Because I had so many attributes I could make the definition of “Level 3 reliability” or “Level 5 code reading” very specific and concrete. Those specific definitions made it relatively easy for engineers to run through each attribute and pick an appropriate rating.

In fact, the average time for an engineer to complete their self assessment was probably ~30 minutes. None of the engineers described the framework as complicated, overwhelming, or hard to understand. Several actually described it as faster and less burdensome than “traditional” “write your own review in prose” type performance reviews they’d had at previous employers.

When engineers did have trouble self-assessing on a specific attribute, it was generally for one of two reasons:

  1. The definition was too composite. E.g. I originally had “verbal communication” and “presenting” as separate attributes, but I consolidated them and ended up with level/attributes like “listens patiently and to understand; guides group conversations; comfortably gives clear topical presentations”. Some engineers clearly satisfied the first two points, but had not yet had the opportunity to give a presentation. So it was unclear what percentage of the points in a description they could miss and still hit the level, especially when the points they were missing seemed relatively minor. In the above example, if verbal communication and presenting had still been separate categories, I wouldn’t have needed to downrate the whole verbal communication attribute just because of limited presentation skills.
  2. The definition was too vague. E.g. I had one level definition along the lines of “writes code to solve complex problems” — what defines “complex?” Or I had another along the lines of “owns a medium size system” — what defines “medium”, or distinguishes a medium sized system from a major system? These terms are genuinely hard to define, but if you want to assess people on these attributes (and I bet you are assessing people on them in practice, whether or not you’re doing so explicitly), you’ve got to find a way to define your terms as clearly as you can.

This last point of vagueness is still an open point of concern for me. As long as I’m doing the sync-up assessments myself I can “voice over” the necessary definitions and provide clarity. But once we scale more and have more managers conducting these conversations, I will need to provide more explicit written definitions in order to ensure we’re applying consistent common standards among managers.

5. Start With A Self-Assessment

I mentioned above that my first trial run of my system with a direct report did not go well. A major issue there was that I set my ratings first and presented them to him without soliciting any self-assessment from him first. That left me blind to moments when his self-perception was out of line with my assessment of him.

Ultimately, I’m the manager and right or wrong, my assessment is the one that’s going in the final record. But I realized that by asking engineers to “go first” by doing a self-assessment, I could have an early warning when I was about to deliver a rating that was going to be contentious. That allowed me to 1) prepare before the sync-up meeting, to gather as much evidence as I could to support my rating, 2) be tactful and empathetic in how I communicated my disagreement, 3) give the engineer time in advance to dredge their memory for examples that supported a given rating (which they could provide in writing in advance, or have handy-to-mind to discuss in our live conversation).

Remember that a primary goal of the whole exercise is to help the engineer grow, and engineers are less likely to grow if they walk out of a discussion furious that they were short-changed by my rating of their data management skills. Anger and resentment can easily short-circuit the parts of our brain that we need in order to listen to and carefully consider feedback. If you can make your feedback easier to hear without materially compromising the essential fairness of the system then (within limits) that’s a reasonable tradeoff. Moreover stress can impair memory, and these conversations are unavoidably stressful even for engineers who are doing well. So the less you can rely on either your or the engineer’s in-the-moment memory to provide examples/evidence for discussion, the better.

While you want the conversations to be a two-way dialog, be particularly careful about letting the most confident people badger you relatively more often into higher ratings – you still need to keep the system fair even for the engineers who aren’t inclined to self-advocate.

6. Focus on Demonstrated Performance, Not Potential

Okay this wasn’t something I learned, but I think this is a key part of a successful framework. You cannot accurately predict people’s potential. Sometimes (like in an interview), you have to try and hope for the best. But you’ll often be wrong. And worse, you’ll often be wrong in biased ways (e.g. overrating the potential of a junior engineer who reminds you of yourself, and underrating someone from an underrepresented background that’s dissimilar to your own).

You do not want to find yourself in a debate where you’re telling an engineer that he would not (hypothetically) succeed at refactoring a major system, or that he would not (hypothetically) be successful collaborating on a cross-team project. You’ll find yourself implicitly (or explicitly) asking people to believe your judgement of them, and they won’t (and maybe they shouldn’t!)

So don’t put yourself in that situation. Instead, rate people on what they’ve concretely demonstrated. “You have not yet worked on a successful cross team project” is much harder to dispute than “you would not successfully collaborate on a cross team project”. And if someone responds with “well, actually I did collaborate on Project X with that other team and it went well” then great! You’re free to change your rating on the spot in light of new evidence.

Tell people ahead of time that you’re focusing on demonstrated performance. Rate people only on demonstrated performance. If someone gets a lower level than they’d like because they didn’t have an opportunity to demonstrate a certain proficiency (which becomes a bigger issue at higher levels, where opportunities for “high impact” moments are more rare), then work with that engineer to make a plan to guide her into the necessary opportunities to demonstrate what she needs to over the next six months.

Because discussions of demonstrated performance vs. an explicit set of clearly defined standards are concrete, specific, and factual, it’s harder for them to become too contentious (unless your definitions are too vague — see above).

(You may notice an intentional connection here to the idea of only giving feedback on behaviors “a video camera could capture”, or of the idea of “speaking objectively / inarguably” in Non-Violent Communication).

7. Framing Matters

Also not strictly a “lesson” but an idea that was definitely reinforced. Leveling is not about ranking. Ranking (particularly “stack ranking”) is destructive to teamwork – don’t do it. But when you give people numerical levels, there will be an urge for engineers to compare themselves to one another.

As the leader, you need to emphasize that this whole exercise is about 1) promoting fairness in recognition and 2) promoting individual growth. Be clear that it’s not perfect, and that there will always be mistakes and sometimes under and over-leveling will happen. But that is not the end of the world, because people’s numeric level is not intended (and will not be treated) as a measure of their intrinsic worth or even of their potential.

It’s just a tool to 1) help people understand what kind of behaviors will make them more successful in your particular organization and 2) to help management identify when people who are performing similarly are not being recognized similarly (“similarly”, not “the same”, because no two people’s proficiencies at a given moment in time will be exactly the same).

To help set the right framing, you can make people’s integer levels (and associated titles) public, but don’t make their full rating matrix public. It’s okay for Suzie and Richard to know that they’re both Level 3, even if Suzie thinks she’s a bit better than Richard. It’s unproductive for Richard to obsess over whether it’s fair that he’s really a Level 2 verbal communicator when he knows Suzie is rated as a Level 4 communicator.

Another minor but probably useful technique that was suggested to me is to call this entire system a “growth framework” instead of a “leveling framework”. I disregarded that advice and initially introduced it to the team as a leveling framework, but have since mostly referred to it internally as “our career growth framework”.

I personally prefer the term “level” because I think it’s more clear about what the system literally is. But “growth framework” seems to have less baggage and to generally promote a healthier reception among the team (particularly, it reduces the inclination to think about leveling as equivalent to ranking).

See You In Six Months

I’m sure I missed some learnings, and I’m sure I’ll learn a lot more over the coming months and years as I get to see this system play out over time (and perhaps I’ll have an update to this post to write then). But I think it’s worth sharing these early lessons in hopes that someone reading this can learn from and avoid some of my mistakes.

Obviously I’ve spent a lot of time thinking about this topic, so I’d be happy to chat with anyone else who’s thinking about rolling out a framework. I got quite a bit of advice and input from friends and peers as I was planning this roll out. I’d be happy for the chance to pay that help forward!

Import Madness

Sing, goddess, the rage of George and the ImportError,

and its devastation, which put pains thousandfold upon his programs,

hurled in their multitudes to the house of Hades strong ideas

of systems, but gave their code to be the delicate feasting

of dogs, of all birds, and the will of Guido was accomplished

since that time when first there stood in division of conflict

Brett Cannon’s son the lord of modules: the brilliant import keyword…

Backstory Before Things Get Weird

This post is about how an ImportError led me to a very strange place.

I was writing a simple Python program. It was one of my first attempts at Python 3.

I tried to import some code and got an ImportError. Normally I solve ImportErrors by shuffling files around until
the error goes away. But this time none of my shuffling solved the problem. So I found myself actually reading the
official documentation for the Python import system. Somehow I’d spent over five years writing Python code professionally
without ever reading more than snippets of those particular docs.

What I learned there changed me.

Yes, I answered my simple question, which had something to do with when I should use .’s in an import:

# most of the time, don't use dots at all:
from spam import eggs

# If Python has trouble finding spam and spam.py is in the same directory
# (i.e. "package") as the code doing the import:
from .spam import eggs

# when spam.py is in the *enclosing* package, i.e. one level up:
from ..spam import eggs

# SyntaxError! At least in Python 3, you can only use the dots with
# the `from a import b` syntax:
import .spam

But more importantly I realized that this whole time I had never really understood what the word module means in Python.

According to the official Python tutorial, a “module” is a file containing Python definitions and statements.

In other words, spam.py is a module.

But it’s not quite that simple. In my running Python program, if I import requests, then what is type(requests)?

It’s module.

That means module is a type of object in a running Python program. And requests in my running program is derived from requests.py, but it’s not the same thing.
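You can check this yourself with a minimal sketch like the one below. It uses the standard library's json module in place of requests (my substitution, so it runs without installing anything) and compares the imported object against types.ModuleType.

```python
import json
import types

# The imported name is bound to an ordinary runtime object...
module_type = type(json)

# ...and that object is an instance of the interpreter's module type.
is_module = isinstance(json, types.ModuleType)

# The object remembers which file it was loaded from, but it is not that file.
source_file = json.__file__

print(module_type)  # <class 'module'>
print(is_module)    # True
print(source_file)
```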

So what is the module class in Python and how is babby module formed?

Modules and the Python Import System

Modules are created automatically in Python when you import. It turns out that the import keyword in Python is syntactic sugar for a somewhat more complicated process. When you import requests, Python actually does two things:

1) Calls an internal function: __import__('requests') to create, load, and initialize the requests module object

2) Binds the local variable requests to that module
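Here is a rough sketch of those two steps done by hand, again using the standard library's json module as a stand-in for requests (an assumption for the sake of a runnable example). The name json_by_hand is made up for illustration.

```python
# Step 1: ask the import machinery for the module object.
# Step 2: bind a local name to whatever comes back.
json_by_hand = __import__("json")

# The normal spelling does both steps at once.
import json

same_object = json_by_hand is json
print(same_object)                   # True
print(json_by_hand.dumps([1, 2]))    # [1, 2]
```

For real programs that need to import by name, the Python docs point to importlib.import_module() rather than calling __import__() directly.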

And how exactly does __import__() create, load, and initialize a module?

Well, it’s complicated. I’m not going to go into full detail, but there’s a great video where Brett Cannon, the main maintainer of the Python import system, painstakingly walks through the whole shebang.

But in a nutshell, importing in Python has 5 steps:

1. See if the module has already been imported

Python maintains a cache of modules that have already been imported. The cache is a dictionary held at sys.modules.

If you try to import requests, __import__ will first check if there’s a module in sys.modules named “requests”. If there is, Python just gives you the module object in the cache and does not do any more work.

If the module isn’t cached (usually because it hasn’t been imported yet, but also maybe because someone did something nefarious…) then:

2. Find the source code using sys.path

sys.path is a list in every running Python program that tells the interpreter where it should look for modules when
it’s asked to import them. Here’s an excerpt from my current sys.path:

# the directory our code is running in:
# where my Python executable lives:
# the place where `pip install` puts stuff:

When I import requests Python goes and looks in those directories for requests.py (or a requests/ package directory). If it can’t find it, I’m in for an ImportError. I’d estimate that the large majority of real life ImportErrors happen because the source code you’re
trying to import isn’t in a directory that’s on sys.path. Move your module or add the directory to sys.path and you’ll have a better day.
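To make that concrete, here is a small sketch that writes a throwaway module into a temporary directory, fails to import it, and then succeeds once the directory is on sys.path. The module name spam_demo and its contents are hypothetical.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module into a directory that is NOT on sys.path.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "spam_demo.py"), "w") as f:
    f.write("eggs = 42\n")

try:
    import spam_demo
    found_before = True
except ImportError:
    found_before = False

# Add the directory to sys.path and try again.
sys.path.append(module_dir)
importlib.invalidate_caches()  # make sure the finders notice the new file
import spam_demo

print(found_before)    # False
print(spam_demo.eggs)  # 42
```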

In Python 3, you can do some pretty crazy stuff to tell Python to look in esoteric places for code. But that’s a topic for another day!

3. Make a Module object

Python has a built-in type for modules, available as types.ModuleType. Once __import__ has found your source code, it’ll create a new ModuleType instance and attach your module.py’s source code to it.

Then, the exciting part:

4. Execute the module source code!

__import__ will create a new namespace, i.e. scope, i.e. the __dict__ attribute attached to most Python objects.
And then it will actually exec your code inside of that namespace.

Any variables or functions that get defined during that execution are captured in that namespace. And the namespace is
attached to the newly created module, which is itself then returned into the importing scope.
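Steps 3 and 4 can be imitated by hand. The sketch below is a simplification, not what the import machinery literally does internally: it skips finders, loaders, and the cache entirely. The module name beeper_demo and its source string are made up for illustration.

```python
import sys
import types

# Hypothetical source code for a module named `beeper_demo`.
source = "beep = 'beep!'\ndef say_beep():\n    return beep\n"

# Step 3: make an empty module object.
module = types.ModuleType("beeper_demo")

# Step 4: execute the source, using the module's __dict__ as the namespace.
exec(source, module.__dict__)

# Everything defined while the source ran now lives on the module.
print(module.beep)        # beep!
print(module.say_beep())  # beep!

# We never touched the cache, so Python does not know about this module.
in_cache = "beeper_demo" in sys.modules
print(in_cache)           # False
```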

5. Cache the module inside sys.modules

If we try to import requests again, we’ll get the same module object back. Steps 2-5 will not be repeated.
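The cache behaviour is easy to observe. This sketch imports a hypothetical throwaway module called cache_demo twice, then evicts it from sys.modules and imports it again. Comparing object identity shows when Python reused the cached module and when it built a fresh one.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical throwaway module, written somewhere importable.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "cache_demo.py"), "w") as f:
    f.write("value = 1\n")
sys.path.append(module_dir)
importlib.invalidate_caches()

import cache_demo
first = cache_demo

import cache_demo                       # cache hit: nothing is re-run
same_when_cached = cache_demo is first

del sys.modules["cache_demo"]           # evict the cached module
import cache_demo                       # the full import happens again
same_after_eviction = cache_demo is first

print(same_when_cached)     # True
print(same_after_eviction)  # False
```

Deleting entries from sys.modules is exactly the kind of "nefarious" trick that makes the rest of this post possible.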

Okay! This is a pretty cool system. It lets us write many pretty Python programs.

But, if we’re feeling demented, it also lets us write some pretty dang awful Python programs.

Where it gets weird

I learned how to fix my immediate import problem. That wasn’t enough.

Gizmo gets wet

With these new import powers in hand, I immediately started thinking about how I could use them for evil, rather than good. Because, as we know:

Good is dumb
(c. Five Finger Tees)

So far, the worst idea I’ve had for how to misuse the Python import system is to
implement a mergesort algorithm using just the import keyword. At first I didn’t know if it was possible. But, spoiler alert, it is!

It doesn’t actually take much code. It just takes the stubbornness to figure out how to subvert a lot of very well-intentioned, normally helpful machinery in the import system.

We can do this. Here’s how:

Remember that when we import a module, Python executes all the source code.

So imagine I start up Python and define a function:

>>> def say_beep():
...     print("beep!.........beep!")
...
>>> say_beep()

This will print out some beeps.

Now imagine instead I write the same lines of code as above into a file called say_beep.py. Then I open my interpreter and run

>>> import say_beep

What happens? The exact same thing: Python prints out some beeps.

If I create a module that contains the same source code as the body of a function then importing the module will produce the same result as calling the function.

Well, what if I need to return something from my function body? Simple:

# make_beeper.py

beeper = lambda x: print("say beep")

# main.py

from make_beeper import beeper

Anything that gets defined in the module is available in the module’s namespace after it’s imported. So from a import b
is structurally the same as b = f(), if I structure my module correctly.
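Side by side, the equivalence looks something like the sketch below. The function make_total and the module make_total_demo are both hypothetical names. The module file holds the same body as the function, minus the def and the return.

```python
import importlib
import os
import sys
import tempfile

# The "function" version.
def make_total():
    total = sum(range(5))
    return total

from_function = make_total()

# The "module" version: same body, written to make_total_demo.py.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "make_total_demo.py"), "w") as f:
    f.write("total = sum(range(5))\n")
sys.path.append(module_dir)
importlib.invalidate_caches()

# "Call" the module and "return" its result.
from make_total_demo import total as from_import

print(from_function, from_import)  # 10 10
```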

Okay, what about passing arguments? Well, that gets a bit harder. The trick is that Python source code is just a long string, so we
can modify the source of a module before we import it:

# with_args.py

a = None
b = None
result = a + b

# main.py

src = ""
with open("with_args.py") as f:
    for line in f:
        src += line

a = "10"
b = "21"

src = src.replace("a = None", f"a = {a}")
src = src.replace("b = None", f"b = {b}")

with open("with_args.py", "w") as f:
    f.write(src)

from with_args import result

print(result)  # it's 31!

Now this certainly isn’t pretty. But where we’re going, nothing is pretty. Buckle up!

How to mergesort

Okay…how can we apply these ideas to implement mergesort?

First, let’s quickly review what mergesort is: it’s a recursive sorting algorithm with n log n worst-case computational complexity
(meaning it’s pretty darn good, especially compared to bad sorting algorithms like bubble sort that have n^2 complexity.)

It works by taking a list, splitting it in half, and then splitting the halves in half until we’re left with individual elements.

Then we merge adjacent elements by interleaving them in sorted order. Take a look at this diagram:

picture of mergesort

Or read the Wikipedia article for more details.

Some rules

  1. No built in sorting functionality. Python’s built in sort uses a derivative of mergesort
    so just putting result = sorted(lst) into a module and importing it isn’t very sporting.
  2. No user-defined functions at all.
  3. All the source code has to live inside one module file, which we will fittingly call madness.py

The code

Well, here’s the code: (Walk-through below, if you don’t feel like reading 100 lines of bizarre Python)

r"""
# This is the algorithm we'll use:

import sys
import re
import inspect
import os
import importlib
import time

input_list = []
sublist = input_list
is_leaf = len(sublist) < 2
if is_leaf:
    sorted_sublist = sublist
else:
    split_point = len(sublist) // 2
    left_portion = sublist[:split_point]
    right_portion = sublist[split_point:]

    # get a reference to the code we're currently running
    current_module = sys.modules[__name__]

    # get its source code using stdlib's `inspect` library
    module_source = inspect.getsource(current_module)

    # "pass an argument" by modifying the module's source
    new_list_slug = 'input_list = ' + str(left_portion)
    adjusted_source = re.sub(r'^input_list = \[.*\]', new_list_slug,
                             module_source, flags=re.MULTILINE)

    # make a new module from the modified source
    left_path = "left.py"
    with open(left_path, "w") as f:
        f.write(adjusted_source)

    # invalidate caches; force Python to do the full import again
    importlib.invalidate_caches()
    if "left" in sys.modules:
        del sys.modules['left']

    # "call" the function to "return" a sorted sublist
    from left import sorted_sublist as left_sorted

    # clean up by deleting the new module
    if os.path.isfile(left_path):
        os.remove(left_path)

    # same steps again for the right half
    new_list_slug = 'input_list = ' + str(right_portion)
    adjusted_source = re.sub(r'^input_list = \[.*\]', new_list_slug,
                             module_source, flags=re.MULTILINE)
    right_path = "right.py"
    with open(right_path, "w") as f:
        f.write(adjusted_source)

    importlib.invalidate_caches()
    if "right" in sys.modules:
        del sys.modules['right']
    from right import sorted_sublist as right_sorted

    if os.path.isfile(right_path):
        os.remove(right_path)

    # merge
    merged_list = []
    while (left_sorted or right_sorted):
        if not left_sorted:
            bigger = right_sorted.pop()
        elif not right_sorted:
            bigger = left_sorted.pop()
        elif left_sorted[-1] >= right_sorted[-1]:
            bigger = left_sorted.pop()
        else:
            bigger = right_sorted.pop()
        merged_list.append(bigger)
    # there's probably a better way to do this that doesn't
    # require .reverse(), but appending to the head of a
    # list is expensive in Python
    merged_list.reverse()
    sorted_sublist = merged_list

# not entirely sure why we need this line, but things
# don't work without it!
sys.modules[__name__].sorted_sublist = sorted_sublist
"""

import random
import os
import time


list_to_sort = [int(1000*random.random()) for i in range(100)]
print("unsorted: {}".format(list_to_sort))

mergesort = __doc__
adjusted_source = mergesort.replace('input_list = []',
                                    'input_list = {}'.format(list_to_sort))

with open("merge_sort.py", "w") as f:
    f.write(adjusted_source)

from merge_sort import sorted_sublist as sorted_list

finished_time = time.time()

print("original sorted: {}".format(sorted(list_to_sort)))
print("import sorted: {}".format(sorted_list))

assert sorted_list == sorted(list_to_sort)

That’s all we need.

Breaking it down

Madness itself

The body of madness.py is compact. All it does is generate a random list of numbers, grab our template implementation of merge sort from its own docstring (how’s that for self-documenting code?), jam in our random list, and kick off the algorithm by running:

from merge_sort import sorted_sublist as sorted_list

The mergesort implementation

This is the fun part.

First, here is a “normal” implementation of merge_sort as a function:

def merge_sort(input_list):
    if len(input_list) < 2:  # it's a leaf
        return input_list
    else:
        # split
        split_point = len(input_list) // 2
        left_portion, right_portion = input_list[:split_point], input_list[split_point:]

        # recursion
        left_sorted = merge_sort(left_portion)
        right_sorted = merge_sort(right_portion)

        # merge: repeatedly take the bigger tail element
        merged_list = []
        while left_sorted or right_sorted:
            if not left_sorted:
                bigger = right_sorted.pop()
            elif not right_sorted:
                bigger = left_sorted.pop()
            elif left_sorted[-1] >= right_sorted[-1]:
                bigger = left_sorted.pop()
            else:
                bigger = right_sorted.pop()
            merged_list.append(bigger)
        # elements were collected biggest-first, so flip them
        merged_list.reverse()
        return merged_list

It has three phases:

  1. Split the list in half
  2. Call merge_sort recursively until the list is split down to individual elements
  3. Merge the sublists we’re working on at this stage into a single sorted sublist by interleaving the elements in sorted order
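
As an aside, the merge in step 3 doesn't have to pop from the tails and reverse at the end. Here is a minimal sketch of a front-to-back merge that walks both lists with two indices. The name merge_front_to_back is made up for illustration and is not part of the code for this post.

```python
def merge_front_to_back(left_sorted, right_sorted):
    """Merge two already-sorted lists into one sorted list.

    Hypothetical helper for illustration: it walks both lists from the
    front with two indices, so no pop() or reverse() is needed.
    """
    merged_list = []
    i, j = 0, 0
    while i < len(left_sorted) and j < len(right_sorted):
        if left_sorted[i] <= right_sorted[j]:
            merged_list.append(left_sorted[i])
            i += 1
        else:
            merged_list.append(right_sorted[j])
            j += 1
    # one list is exhausted; whatever is left of the other is already sorted
    merged_list.extend(left_sorted[i:])
    merged_list.extend(right_sorted[j:])
    return merged_list


print(merge_front_to_back([1, 4, 9], [2, 3, 10]))  # [1, 2, 3, 4, 9, 10]
```

Unlike the pop-based version, this one leaves its two input lists untouched.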

But since our rule says that we can’t use functions, we need to replace this recursive function with import.

That means replacing this:

left_sorted = merge_sort(left_portion)

With this:

# get a reference to the code we're currently running
current_module = sys.modules[__name__]
# get its source code using stdlib's `inspect` library
module_source = inspect.getsource(current_module)

# "pass an argument" by modifying the module's source
new_list_slug = 'input_list = ' + str(left_portion)
adjusted_source = re.sub(r'^input_list = \[.*\]', new_list_slug, module_source, flags=re.MULTILINE)

# make a new module from the modified source
left_path = "left.py"
with open(left_path, "w") as f:
    f.write(adjusted_source)

# invalidate caches
importlib.invalidate_caches()
if "left" in sys.modules:
    del sys.modules['left']

# "call" the function to "return" a sorted sublist
from left import sorted_sublist as left_sorted

# clean up by deleting the new module
if os.path.isfile(left_path):
    os.remove(left_path)

# not entirely sure why we need this line, but things
# don't work without it! Might be to keep the sorted sublist
# alive once this import goes out of scope?
sys.modules[__name__].sorted_sublist = sorted_sublist

And that’s really it.

We just use the tools we learned about to simulate calling functions with arguments and returning values. And we add a few lines to trick Python into not caching modules and instead doing the full import process when we import a module with the same name as one that’s already been imported. (If our merge sort execution tree has multiple levels, we’re going to have a lot of different left.py’s).
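
Stripped of the sorting, the "call a function by importing a module" trick can be sketched on its own. This is a simplified illustration, not the code from the post: the template, the call_by_import helper and the callee module name are all made up, and it uses importlib.import_module and a temporary directory instead of `from left import ...` in the working directory.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical template: the "function body" is plain module-level code.
TEMPLATE = "argument = []\nreturn_value = sorted(argument)\n"


def call_by_import(argument, module_name="callee"):
    """Simulate `return_value = f(argument)` using an import."""
    # "pass an argument" by editing the source before it is imported
    source = TEMPLATE.replace("argument = []",
                              "argument = {!r}".format(argument))

    with tempfile.TemporaryDirectory() as module_dir:
        with open(os.path.join(module_dir, module_name + ".py"), "w") as f:
            f.write(source)

        # make the new file importable and forget any earlier "call"
        sys.path.insert(0, module_dir)
        sys.modules.pop(module_name, None)
        importlib.invalidate_caches()
        try:
            # importing the module runs its body
            module = importlib.import_module(module_name)
        finally:
            sys.path.remove(module_dir)
            sys.modules.pop(module_name, None)

    # "return a value" by reading a module-level name
    return module.return_value


print(call_by_import([3, 1, 2]))  # [1, 2, 3]
```

Writing each "call" into a fresh temporary directory also sidesteps stale bytecode caches, since every import sees a directory with no leftover files in it.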

And that’s how you abuse the Python import system to implement mergesort.

Many paths to the top of the mountain, but the view is a singleton.

It’s pretty mindblowing (to me at least) that this approach works at all. But on the other hand, why shouldn’t it?

There’s a pretty neat idea in computer science called the Church-Turing thesis. It states that any effectively computable function can be computed by a universal Turing machine. The thesis is usually trotted out to explain why there’s nothing you can compute with a universal Turing machine that you can’t compute using lambda calculus, and therefore there’s no program you can write in C that you can’t, in principle, write in Lisp.

But here’s a corollary: since you can, if you really want to, implement a Turing tape by writing files to the file system one bit at a time and importing the results, you can use the Python import system to simulate a Turing machine. That implies that, in principle, any computation that can be performed by a digital computer can be performed (assuming infinite space, time, and patience) using the Python import system.

The only real question is how annoying a computation will be to implement, and in this case Python’s extreme runtime dynamism makes this particular implementation surprisingly easy.

The Python community spends a lot of time advocating for good methodology and “idiomatic” coding styles. They have a good reason: if you’re writing software that’s intended to be used, some methods are almost always better than their alternatives.

But if you’re writing programs to learn, sometimes it’s helpful to remember that there are many different models of computation under the sun. And especially in the era when “deep learning” (i.e. graph-structured computations that simulate differentiable functions) is really starting to shine, it’s extra important to remember that sometimes taking a completely different (and even wildly “inefficient”) approach to a computational problem can lead to startling success.

It’s also nice to remember that Python itself started out as (and in a sense still is!) a ridiculously inefficient and roundabout way to execute C code.

Abstractions really matter. In the words of Alfred North Whitehead,

Civilization advances by extending the number of important operations which we can perform without thinking about them

My “import sort” is certainly not a useful abstraction. But I hope that learning about it will lead you to some good ones!

Nota Bene

In case it’s not obvious, you should never actually use these techniques in any code that you’re intending to actually use for anything.

But the general idea of modifying Python source code at import time has at least one useful (if not necessarily advisable) use case: macros.

Python has a library called macropy that implements Lisp-style syntactic macros in Python by modifying your source code at import time.
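
To give a flavor of the general idea without relying on macropy's actual API, here is a minimal sketch that uses only the standard library's ast module. It rewrites a module's syntax tree before executing it. The AddToMultiply transformer and the import_with_rewrite helper are made up for illustration; a real macro system would hook into the import machinery rather than take a source string.

```python
import ast
import types


class AddToMultiply(ast.NodeTransformer):
    """Toy 'macro': rewrite every `a + b` into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


def import_with_rewrite(source, module_name="rewritten"):
    """Build a module object from source after transforming its syntax tree."""
    tree = ast.parse(source)
    tree = ast.fix_missing_locations(AddToMultiply().visit(tree))
    module = types.ModuleType(module_name)
    exec(compile(tree, "<" + module_name + ">", "exec"), module.__dict__)
    return module


module = import_with_rewrite("result = 3 + 4\n")
print(module.result)  # 12, because the + was rewritten to *
```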

I’ve never actually used macropy, but it’s pretty cool to know that Python makes the simple things easy and the insane things possible.

Finally, as bad as this mergesort implementation is, it allowed me to run a fun little experiment. We know that mergesort has good computational complexity compared to naive sorting algorithms. But how long does a list have to be before a standard implementation of bubble sort runs slower than my awful import-based implementation of mergesort? It turns out that a list only has to be about 50k items long before “import sort” is faster than bubble sort.
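
For a rough idea of what that comparison involves, here is a minimal sketch that assumes a textbook bubble sort and a small timing helper. It is not necessarily the exact code used for the experiment, and time_sort is a made-up name.

```python
import random
import time


def bubble_sort(items):
    """Textbook bubble sort; returns a new sorted list."""
    result = list(items)
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:  # no swaps means the list is already sorted
            break
    return result


def time_sort(sort_function, size):
    """Hypothetical helper: seconds taken to sort `size` random integers."""
    data = [random.randrange(1000) for _ in range(size)]
    start = time.perf_counter()
    sort_function(data)
    return time.perf_counter() - start


# bubble sort is O(n^2): doubling the input roughly quadruples the time
for size in (500, 1000, 2000):
    print(size, time_sort(bubble_sort, size))
```

Running the same helper against the import-based sort at growing sizes is one way to look for the crossover point.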

Computational complexity is a powerful thing!

All the code for this post is on GitHub.

“Unresolved identifier” in Swift when importing Frameworks using Cocoapods

The latest Cocoapods (0.36) has a nifty feature: it allows you to import pods written in Swift (such as the networking library Alamofire).

It does this by asking you to insert a little line into your Podfile:

use_frameworks!

If you, like me, are new to iOS development it might not be obvious to you what that line does. It converts all of your pods from being static libraries into being frameworks.

I haven’t gotten around to reading the absurdly long and dense Framework Programming Guide but as best I understand, that means that whereas once your pods were snippets of code that were compiled and rolled directly into your project binary, they are now instead separate folders of code that your app is “aware are over there in their own special place, somewhere”.

One consequence of this transformation from libraries into frameworks is that all of your references to Objective-C Cocoapods classes inside of your Swift code will mysteriously stop working. Of course they won’t tell you why they’ve stopped working (that would be silly!). Your Swift files will just suddenly fail to compile, complaining that all your references to these Cocoapod classes are now “unresolved identifiers”.

The solution to this problem isn’t obvious (at least to me), but it is easy. So let me save you the couple of hours it took me to figure it out.


Normally when you’re importing Objective-C code into Swift, you do so by including the header of the file containing that code in the “Bridging Header” for your project. And that is indeed how you include code from a static library (which your pods used to be.)

But it is not how you import Objective-C code from a Framework. To do that you simply type…

import Framework

…inside your Swift file that’s using the Objective-C class (where “Framework” is the name of the actual Framework containing the class.)

That’s it. Good luck!

If you earn $23,500/year, the Apple Watch pays for itself.

Apple Pay? More like Apple Pays For Itself! Amiright?

Okay you can stop reading now.

Or you can continue and learn how I’ve actually calculated that number using science (i.e. estimation + numerology.)

The numbers

SERIOUSLY THOUGH, I watched Apple’s livestream on Monday where they revealed slightly more detail about “the watch”, a.k.a. the Apple Watch, a.k.a. a beautifully manufactured, very elegant looking and shiny gizmo that costs a minimum of $349 and in my expert opinion doesn’t do very much.

The Pretty

Okay sure it does stuff. It lets you send scribbles of trollfaces to your friends and then use Siri to text them an audio recording of your voice to let them know you’ve scribbled them a trollface. And then you can scan your watch to pass through security at the airport, saving you the several seconds that you’ll then need to spend removing your watch to go through the body scanner.

It struck me that almost everything one can currently do with the Apple Watch, one can already do with the iPhone that you must carry at all times if you want your watch to work. Now looking at what a device can currently do or currently costs is a bad way to predict the future. But it’s a pretty good way to predict the present. And at present, the Watch mostly just saves you from having to pull out your phone.

So I was left wondering “exactly how valuable is that?” Well…turns out it’s more valuable than I expected.

People actually spend rather a lot of time pulling out their phones. By some estimation, we do it 150 times per day on average, which tallies up to ~76 hours per year we spend simply removing our phones from our pockets and putting them back in. The Apple Watch offers to give us some of that time back. Not all of it (since many notifications still need the phone to fully address). But some of it.

And how much is that time worth? That depends on how much your individual time is worth, but by my estimation it’s about $500/year if you earn a typical Silicon Valley salary. So if we assume that the watch has a 3 year replacement cycle (i.e. more like a Macbook than an iPhone), the watch will pay for itself about 4 times over in time it saves the people who might be reading this post. And the minimum you can make before the economic argument disappears is…$23,500. Below that, the hourly value of your time isn’t enough to buy an (entry-level) Apple Watch even if you could cash in every minute you saved.
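
The shape of that arithmetic can be sketched in a few lines. The 150 pulls per day, the $349 price and the 3 year replacement cycle come from the text above. The 5 seconds per pull, the 2,000 working hours per year and the fraction_saved parameter are assumptions made here for illustration, so the results will not exactly reproduce the original calculations.

```python
PULLS_PER_DAY = 150          # estimate quoted above
SECONDS_PER_PULL = 5         # assumption: consistent with ~76 hours/year
WATCH_PRICE = 349.0          # entry-level price, in dollars
REPLACEMENT_YEARS = 3        # replacement cycle assumed in the text
WORK_HOURS_PER_YEAR = 2000   # assumption: 50 weeks x 40 hours


def hours_pulling_per_year():
    """Hours per year spent taking the phone out and putting it back."""
    return PULLS_PER_DAY * 365 * SECONDS_PER_PULL / 3600


def yearly_value_saved(salary, fraction_saved):
    """Dollar value of the time saved per year at a given salary."""
    hourly_wage = salary / WORK_HOURS_PER_YEAR
    return hours_pulling_per_year() * fraction_saved * hourly_wage


def break_even_salary(fraction_saved):
    """Salary at which the time saved exactly pays for the watch."""
    yearly_watch_cost = WATCH_PRICE / REPLACEMENT_YEARS
    hours_saved = hours_pulling_per_year() * fraction_saved
    return yearly_watch_cost / hours_saved * WORK_HOURS_PER_YEAR


print(round(hours_pulling_per_year(), 1))  # 76.0
```

Under these assumptions, break_even_salary(0.13) comes out close to the $23,500 figure, which would correspond to the watch saving roughly 13% of the time spent pulling out a phone.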

And there, of course, is the rub(…ber watch band!) This calculation is based on a lot of (plausible) assumptions about how intensively people use their phones and on one big (much less plausible) assumption that the right way to value “time saved” is in comparison to an hourly wage. With something like the Apple Watch, the time saved will come in many, many scattered five-second increments. Most people, whether salaried or hourly, don’t actually have the ability to convert free time into money by simply working more. And those who do will mostly not be able to convert 720 five-second intervals into a full hour of productive work.

Or maybe they will, as soon as someone invents a Mechanical Turk app for the Watch that lets us monetize our milliseconds. Guess we’ll have to watch and see.

Don’t believe my numbers? Have a look at my calculations!

Glance at me over on twitter @rogueleaderr for more mumblings.

$1 Billion and Change

While getting $0.71 of change at a coffee shop today, I started to wonder just how much time is actually consumed by the act of “getting change”.

So I calculated it, and the answer is that $1 billion of time is consumed in the USA waiting for change.

That’s actually a bit smaller than I expected. And I calculate the total number of cash retail purchases at ~48 billion/year, so assuming $0.50 of change per purchase, it would cost consumers $24 billion per year to say “keep the change”.

I guess I’ll keep waiting for payment practices to change!

June 13, 2014 at 8:57:15 PM
retail sales = ($400 × 10^9) × 12
avg purchase = $30
percent cash = 30%
total cash purchases = retail sales / avg purchase × percent cash
seconds per change = 4 seconds
time spent = total cash purchases × seconds per change
1.92×10^11 seconds
total_hours = time spent in hours
53,333,333.33 hours
avg_wage = $20 / hour
cost = total_hours × avg_wage
$1,066,666,666.67
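
The same calculation as a short Python sketch, for anyone who wants to plug in their own numbers. All inputs are taken from the figures above; the variable names are made up here.

```python
RETAIL_SALES_PER_MONTH = 400e9   # dollars
AVG_PURCHASE = 30.0              # dollars
PERCENT_CASH = 0.30              # share of purchases paid in cash
SECONDS_PER_CHANGE = 4           # time spent waiting for change
AVG_WAGE = 20.0                  # dollars per hour
CHANGE_PER_PURCHASE = 0.50       # dollars left behind per purchase

retail_sales = RETAIL_SALES_PER_MONTH * 12
total_cash_purchases = retail_sales / AVG_PURCHASE * PERCENT_CASH
time_spent_seconds = total_cash_purchases * SECONDS_PER_CHANGE
total_hours = time_spent_seconds / 3600

# value of the time spent waiting for change
cost_of_waiting = total_hours * AVG_WAGE
# what it would cost to always say "keep the change"
cost_of_keeping_change = total_cash_purchases * CHANGE_PER_PURCHASE

print("cash purchases per year: {:,.0f}".format(total_cash_purchases))
print("hours spent waiting:     {:,.0f}".format(total_hours))
print("cost of waiting:         ${:,.0f}".format(cost_of_waiting))
print("cost of keeping change:  ${:,.0f}".format(cost_of_keeping_change))
```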

The Books Behind Me During the A* Interviews

Since I was asked on Twitter. It should be noted that that shelf is where I keep my queue of books I have purchased but not yet read.

Pro Django – Alchin

Ready Player One – Ernest Cline

Cracking the Coding Interview – Laakmann McDowell

From Counterculture to Cyberculture – Fred Turner

About Face 3, Essentials of Interaction Design – Cooper

Four Hour Body – Tim Ferriss

The Paleo Manifesto – John Durant

Permutation City – Greg Egan

Antifragile – Nassim Taleb

The Feynman Lectures on Physics, Vol 1 – Richard Feynman

Surfaces and Essences – Hofstadter

The Philosophy of Symbolic Forms – Cassirer

The Elements of Typographic Style – Bringhurst

Universal Principles of Design – Lidwell

A Tale of Two Cities – Dickens

The Success Equation – Mauboussin

In the Heart of the Sea – Philbrick

Masters of Doom – Kushner

Coders at Work – Seibel

Customers Included – Hurst and Terry

Godel, Escher, Bach – Hofstadter

Futility – Gerhardie

The Frogs – Aristophanes

Den of Thieves – Stewart

Modernism – Peter Gay

Dithyrambs of Dionysus – Nietzsche

The Last Tycoon – Fitzgerald

Essays in Experimental Logic – Dewey

The Magic Mountain – Mann

Speech and Language Processing – Jurafsky, Martin

Database Management Systems – Gehrke

Semantic Web for the Working Ontologist – Allemang, Hendler

The Entrepreneur’s Guide to Business Law – Dauchy

Much obliged to any commenter who’d like to provide the Amazon links.

Future of the Economy Part 5 – Lateral Productivity Growth

Back in the sky, back at the keyboard. Last time in this series, I made two claims:

1) The only way to consistently improve human well-being is to foster productivity growth.

2) Productivity only grows when people invent better methods of production (i.e. better technology).

The economy is just a machine that turns raw commodities (e.g. iron) into consumable products (e.g. corn flakes). The output of any given machine is limited by how fast the machine can operate; a miner can only dig so fast and a CPU can only cycle so fast. Machines tend to be made faster over time, but usually at a pretty slow and steady rate [1].

Productivity Growth is Steady

And consequently annual productivity growth over the last 100 years has been remarkably consistent at around 1-2%. If we want productivity to grow faster than that, we can’t just speed up existing processes. We have to implement completely new ones. Instead of getting faster at harvesting bat guano, we need to invent synthetic fertilizer.

So…how do we do that? That’s the key question in this whole series.

Well, the way that humans do it is through a specific type of intelligence commonly called “lateral reasoning” (or sometimes just “creativity”). Everyone has an intuitive idea of what creative intelligence is: in a classic test, a child is given a paperclip and asked to write down as many ways of using the paperclip as she can. And from there it’s a pretty short leap to “how can I build a CO2 filter out of duct tape and a flight manual?”

Unlike its close cousin linear reasoning (i.e. 1+1=?), lateral reasoning is rather poorly understood. So much so that it’s often treated with a sort of mystic reverence (cf. “a flash of inspiration”). And it’s the last unironic refuge of the word “genius” in popular discourse (cf. “the creative genius Steve Jobs”). And while computers have come to dominate humans at classical intelligence tests, the most advanced computer in the world can’t figure out how to fix a toaster [2].

But lateral reasoning is not magic. It actually works pretty much the same way as linear reasoning.

Consider a Chess game:

  1. Start in some situation, e.g. in check, down a knight.
  2. Using your knowledge of the rules, consider all legal moves (or use heuristics to only consider a subset).
  3. Imagine the sequence of potential consequences of each action and calculate the most promising path.

Now consider a lateral problem:

  1. Start in some situation, e.g. on a desert island with a can of beans and a rock.
  2. Using your knowledge of how the world works, figure out potential “moves” you can make, e.g. “smash the can with a rock”.
  3. Consider the consequences of your options and choose the best action.

The only difference is that linear problems tend to involve relatively simple, clearly specified situations, a small number of simple rules, and a potentially enormous sequence of steps to solve, whereas lateral problems can often be solved in just a few steps but involve complex, nebulous situations and an enormous number of complicated, underspecified rules.

Computers, at present, are fantastically adapted for the former type of problem and terribly adapted for the latter. But that’s going to change fast (even if I have to change it myself). Over the coming decades, we’re going to see an explosion of computers designed to extend human lateral intelligence. And that’s going to produce productivity gains unlike anything we’ve seen before.

Next time, I’ll tell you how it’s all going to happen.

[1] Even semi-exceptions like Moore’s law tend to be steady even if they aren’t slow.

[2] Unless Google can find an exact recipe some human wrote down.