You're running into the slowness of Python, or more precisely of the CPython interpreter. From Wikipedia:
NumPy targets the CPython reference implementation of Python, which is a non-optimizing bytecode compiler/interpreter. Mathematical algorithms written for this version of Python often run much slower than compiled equivalents. NumPy seeks to address this problem by providing multidimensional arrays and functions and operators that operate efficiently on arrays. Thus any algorithm that can be expressed primarily as operations on arrays and matrices can run almost as quickly as the equivalent C code.
And from the SciPy FAQ:
Python’s lists are efficient general-purpose containers. They support (fairly) efficient insertion, deletion, appending, and concatenation, and Python’s list comprehensions make them easy to construct and manipulate. However, they have certain limitations: they don’t support “vectorized” operations like elementwise addition and multiplication, and the fact that they can contain objects of differing types mean that Python must store type information for every element, and must execute type dispatching code when operating on each element. This also means that very few list operations can be carried out by efficient C loops – each iteration would require type checks and other Python API bookkeeping.
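To make the quoted limitations concrete, here is a minimal sketch (the variable names are made up for illustration and are not from the question). It shows that `+` on lists concatenates instead of adding elementwise, that a list can mix types, and that a NumPy array with a single dtype does the same addition without a Python-level loop:

```python
import numpy as np

a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]

# For lists, + means concatenation, not elementwise addition.
concatenated = a + b

# Elementwise addition on lists needs an explicit Python-level loop;
# every iteration goes through the interpreter and its type dispatch.
list_sum = [x + y for x, y in zip(a, b)]

# A list may mix types, so each element carries its own type information.
mixed = [1, "two", 3.0]
mixed_types = [type(item).__name__ for item in mixed]

# A NumPy array has a single dtype, so the addition runs in one compiled loop.
array_sum = np.array(a) + np.array(b)
```

The list comprehension and the array expression give the same numbers; the difference is where the loop runs.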
Note that this overhead isn't specific to Python; for more background see e.g. this and this question on SO.
Because of the overhead of the dynamic type system and the interpreter, Python would be far less useful for high-performance number crunching if it couldn't tap into all sorts of compiled C and Fortran libraries (e.g. NumPy). There are also JIT compilers such as Numba and PyPy that try to get Python code to execute at speeds closer to those of statically typed, compiled code.
Bottom line: you're doing too much in plain Python relative to the work you're offloading to fast C code. I suppose you need to adopt a more "array-oriented" coding style, rather than an object-oriented one, to get good performance with NumPy (MATLAB is a very similar story in this regard). On the other hand, if you used a more efficient algorithm (see the answer by Ara), the slowness of Python might not be such an issue.
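As a rough illustration of what "array-oriented" means here, consider the sketch below. It is not your code: the `Particle` class and both `step_*` functions are hypothetical, invented only to contrast the two styles. The object-oriented version makes one Python method call per element, while the array-oriented version hands the whole update to NumPy in a single call:

```python
import numpy as np


class Particle:
    """Hypothetical object-oriented layout: one Python object per particle."""

    def __init__(self, position, velocity):
        self.position = position
        self.velocity = velocity

    def step(self, dt):
        self.position += self.velocity * dt


def step_objects(particles, dt):
    # One Python method call per particle, all executed by the interpreter.
    for p in particles:
        p.step(dt)


def step_arrays(positions, velocities, dt):
    # One expression updates every particle in place inside compiled code.
    positions += velocities * dt


n = 1000
rng = np.random.default_rng(0)
start = rng.random(n)
speed = rng.random(n)

particles = [Particle(float(x), float(v)) for x, v in zip(start, speed)]
positions = start.copy()

step_objects(particles, 0.1)
step_arrays(positions, speed, 0.1)

# Both styles compute the same result.
same = np.allclose([p.position for p in particles], positions)
```

How much faster the array version is depends on the problem size and the work done per element, so it is worth timing both on your own data (e.g. with `timeit`).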