Pascal’s Triangle

Summary

This article is about Pascal’s Triangle building algorithm implementations, their performance and limitations analysis. The article shows some techniques for performance optimization that can be used for solving real life software optimization problems.
Programming languages:
  • Python (CPython and PyPy implementations)
  • Cython
  • C
  • Assembler
Optimization techniques used:
  • Microoptimizations
  • Replace recursive algorithm with non-recursive
  • Replace Python with other programming languages

Conclusions

  • It is possible to improve performance for more than 200 times if you are ready to switch from CPython to C or Assembler
  • You probably should not try to implement your algorithm in Assembler for better performance unless you are as proficient in Assembler as a C-compiler developer, otherwise just use compiler optimization options
  • It was possible to improve CPython implementation by 40% by switching to non-recursive implementation and other language specific optimizations
  • It is possible to improve performance by 20-50% by switching to Cython with no changes to source code
  • It is possible to improve performance for 2-7 times by switching to PyPy with no changes to source code
  • As expected from fastest to slowest: Assembler and C, PyPy, Cython, CPython
  • You should always check if your performance optimization leads to improvement, because sometime unexpectedly they lead to performance degradation
  • Faster implementations may have limitations that make them impractical. For example CPascalTriangleFullAsm has the best time performance, but limited to 34 height. Therefore it is more practical to cache or precalculate Pascal’s Triangle of this height and return result from memory which would would run even faster than the Assembler implementation.

Implementation Notes

  • Complete source code can be found in pascal_triangle github repository
  • This article is automatically generated with Sphinx
  • Metaclasses are used for unit-testing variety of solutions of the same problem
  • Specific property of Pascal’s Triangle is used to test the solutions: sum of line’s integers is equal to 2 to the power of line number
  • cythonize() tries to compile *.pyx files before Cython is installed according to setup_requires argument of setup(), so there is a need to have a LazyList() as work around
  • Cython is incompatible with PyPy, so there is a need to check for python implementation being used
  • It is important to set constant CPU frequency to get stable benchmark results

Full story

The basic definition of Pascal’s Triangle is the following:
Pascal’s Triangle is a triangle formed of lines of integer numbers. Each line has one more integer than the previous line and centered with the previous line. First line has only one integer with value of 1. Every integer in each line is a sum of two above integers.
Example:
      1
    1   1
   1  2  1
  1 3   3 1
 1 4  6  4 1
1 5 10 10 5 1
For more strict mathematical definition, please, see Pascal’s Triangle wikipedia article.
Here is the code that I produced on a job interview when I was asked to code a program in Python that prints Pascal’s Triangle:
def print_pascal_triangle_original(n):
    """
    Print Pascal's Triangle of height n: original interview solution.
    :param n: height of the triangle to print
    :return: n-th line of Pascal's Triangle
    """
    if n <= 1:
        print [1]
        return [1]

    a = [0] + print_pascal_triangle_original(n - 1) + [0]
    b = [a[i] + a[i + 1] for i in xrange(len(a) - 1)]
    print b
    return b
For the purpose of this article I made small changes to my original implementation (zero-based lines and function is converted to a class method):
from .base import PyPascalTriangleBase


class PyPascalTriangleOriginal(PyPascalTriangleBase):
    """
    Modified original solution presented on interview.
    """

    original = True
    max_height = 900

    def build(self, height):
        if height <= 0:
            self._print([1])
            return [1]

        a = [0] + self.build(height - 1) + [0]
        b = [a[i] + a[i + 1] for i in xrange(len(a) - 1)]
        self._print(b)
        return b  # return last line for testing purposes

Limitations

When I started implementation of the Pascal’s Triangle building algorithm it turned out that different implementations have their own limitations. Therefore limitations should be described first.
Here is the result of limitation test:
$ python -m pascal_triangle.tests.limitation_test
# Implementation Language Height limit Duration, s Reason Info
1 PyPascalTriangleOriginal Python 993 0.053755 exception RuntimeError(‘maximum recursion depth exceeded’,)
2 PyPascalTriangleOriginalPlus Python 993 0.053977 exception RuntimeError(‘maximum recursion depth exceeded’,)
3 PyPascalTriangleConstantLists Python 993 0.054690 exception RuntimeError(‘maximum recursion depth exceeded’,)
4 PyPascalTriangleConstantTuples Python 993 0.070621 exception RuntimeError(‘maximum recursion depth exceeded’,)
5 PyPascalTriangleIterators Python 993 0.047875 exception RuntimeError(‘maximum recursion depth exceeded’,)
6 PyPascalTriangleNonRecursive Python 3072 0.746115 timeout 1.000000 s, (duration: 1.474587 s)
7 PyPascalTriangleNonRecursiveCLike Python 3072 0.935708 timeout 1.000000 s, (duration: 1.883892 s)
8 PyPascalTriangleNonRecursiveCLikeImproved Python 3072 0.846512 timeout 1.000000 s, (duration: 1.728939 s)
9 PyPascalTriangleNonRecursiveLessCLike Python 3072 0.729132 timeout 1.000000 s, (duration: 1.533341 s)
10 PyPascalTriangleNonRecursiveArray Python 34 0.000096 exception OverflowError(‘unsigned int is greater than maximum’,)
11 PyPascalTriangleNonRecursiveIterators Python 3072 0.678832 timeout 1.000000 s, (duration: 1.382548 s)
12 CyPascalTriangleIterators Cython 992 0.036476 exception RuntimeError(‘maximum recursion depth exceeded’,)
13 CyPascalTriangleConstantLists Cython 992 0.043471 exception RuntimeError(‘maximum recursion depth exceeded’,)
14 CyPascalTriangleNonRecursive Cython 3072 0.610665 timeout 1.000000 s, (duration: 1.275582 s)
15 CyPascalTriangleNonRecursiveIterators Cython 3072 0.552308 timeout 1.000000 s, (duration: 1.181184 s)
16 CyPascalTriangleNonRecursiveCLikeImproved Cython 3072 0.690947 timeout 1.000000 s, (duration: 1.453662 s)
17 CyPascalTriangleNonRecursiveCLikeMoreImproved Cython 3072 0.728800 timeout 1.000000 s, (duration: 1.485473 s)
18 CyPascalTriangleNonRecursiveLessCLike Cython 3072 0.614456 timeout 1.000000 s, (duration: 1.314016 s)
19 CyPascalTriangleNonRecursiveCTypes Cython 34 0.000044 exception OverflowError(‘unsigned int is greater than maximum’,)
20 CyPascalTriangleNonRecursiveCTypesULong Cython 67 0.000178 exception OverflowError(‘long int too large to convert’,)
21 CPascalTriangleInitial C 34 0.000002 exception ValueError(“Unable to build Pascal’s Triangle higher than 34”,)
22 CPascalTriangleULong C 67 0.000006 exception ValueError(“Unable to build Pascal’s Triangle higher than 67”,)
23 CPascalTrianglePartialAsm C/Assembler 34 0.000003 exception ValueError(“Unable to build Pascal’s Triangle higher than 34”,)
24 CPascalTriangleFullAsm C/Assembler 34 0.000002 exception ValueError(“Unable to build Pascal’s Triangle higher than 34”,)
As seen from the table there are several thresholds of triangle height limitation:
  • 34 is a height limit for all algorithms that use unsigned 32-bit integers to store resulting lines.
  • 67 is a height limit for all algorithms that use unsigned 64-bit integers to store resulting lines.
  • A just below 1000 height limit is valid for all recursive algorithms in Python since the default recursion limit in python is 1000 frames.
  • The other implementations are virtually limited by the time required for calculation (1 second).
From practical point of view limit of 34 and 67 height makes corresponding algorithms useless since lines Pascal’s Triangle of 67 height can be precomputed and stored in memory with any other algorithm. It would take just (height + 2) * (height + 1) / 2 * 8 = (67 + 2) * (67 + 1) / 2 * 8 = 18 768 bytes for 64-bit implementation.
So all optimizations that are limited by 64-height make only scientific interest.
Another conclusion from limitation test is that we can only compare implementations for particular heights. There will be 4 benchmarks of different heights: 34, 67, 900, 3000.

Performance

CPython

  • The best implementation is Assembler x86 implementation
  • Non-recursive implementations do better than recursive implementations
  • Cython implementations do better than Python implementations
  • C implementations do better than Cython implementations
  • Assembler x86 implementation does better than C implementations
  • Best implementation is about 100 times faster than the original implementation
  • Best C implementation is about 50 times faster than the original implementation
  • Best Cython implementation is about 90% faster than the original implementation
  • Best Python implementation is about 40% faster than the original implementation
  • PyPascalTriangleNonRecursive, PyPascalTriangleNonRecursiveLessCLike and PyPascalTriangleNonRecursiveCLikeImproved are three best Python implementations
  • Cython implementation that used C type is not the best Cython implementation
  • 32-bit vs 64-bit integers do not make notable difference
  • Sometimes more optimized Python implementation can do better than less optimized Cython implementation
  • Some “optimizations” lead to up to 30% performance loss
Each implementation.build(34) called for 1000 times (ordered by performance)
# Implementation Language Duration, secs Fraction of (*) duration % faster than (*)
1 CPascalTriangleFullAsm C/Assembler 0.000718 0.009341 10606.042497
2 CPascalTrianglePartialAsm C/Assembler 0.001081 0.014060 7012.174680
3 CPascalTriangleULong C 0.001115 0.014504 6794.718837
4 CPascalTriangleInitial C 0.001393 0.018117 5419.787744
5 CyPascalTriangleNonRecursive Cython 0.040292 0.524077 90.811671
6 CyPascalTriangleNonRecursiveCTypes Cython 0.041818 0.543927 83.848162
7 CyPascalTriangleNonRecursiveCLikeImproved Cython 0.042011 0.546433 83.005119
8 CyPascalTriangleNonRecursiveLessCLike Cython 0.042353 0.550883 81.526787
9 CyPascalTriangleNonRecursiveCLikeMoreImproved Cython 0.042596 0.554046 80.490426
10 CyPascalTriangleNonRecursiveCTypesULong Cython 0.045652 0.593793 68.408903
11 CyPascalTriangleNonRecursiveIterators Cython 0.049919 0.649293 54.013612
12 CyPascalTriangleIterators Cython 0.052587 0.683998 46.199324
13 CyPascalTriangleConstantLists Cython 0.053835 0.700229 42.810452
14 PyPascalTriangleNonRecursive Python 0.054130 0.704068 42.031730
15 PyPascalTriangleNonRecursiveLessCLike Python 0.055244 0.718556 39.167925
16 PyPascalTriangleNonRecursiveCLikeImproved Python 0.064518 0.839183 19.163513
17 PyPascalTriangleNonRecursiveIterators Python 0.065121 0.847029 18.059735
18 PyPascalTriangleConstantLists Python 0.071139 0.925304 8.072619
19 PyPascalTriangleIterators Python 0.071390 0.928566 7.692924
20 PyPascalTriangleOriginalPlus Python 0.073988 0.962359 3.911345
21 PyPascalTriangleNonRecursiveCLike Python 0.074031 0.962920 3.850774
22 PyPascalTriangleOriginal Python 0.076882 1.000000 0.000000
23 PyPascalTriangleNonRecursiveArray Python 0.092010 1.196771 -16.441835
24 PyPascalTriangleConstantTuples Python 0.108509 1.411371 -29.146882
Each implementation.build(67) called for 1000 times (ordered by performance)
# Implementation Language Duration, secs Fraction of (*) duration % faster than (*)
1 CPascalTriangleULong C 0.004018 0.017983 5460.716829
2 CyPascalTriangleNonRecursive Cython 0.132844 0.594593 68.182373
3 CyPascalTriangleNonRecursiveLessCLike Cython 0.136494 0.610929 63.685050
4 CyPascalTriangleNonRecursiveIterators Cython 0.145995 0.653455 53.032839
5 CyPascalTriangleNonRecursiveCLikeImproved Cython 0.149224 0.667908 49.721277
6 CyPascalTriangleIterators Cython 0.151895 0.679863 47.088499
7 CyPascalTriangleNonRecursiveCLikeMoreImproved Cython 0.163172 0.730337 36.923084
8 CyPascalTriangleConstantLists Cython 0.167823 0.751154 33.128569
9 CyPascalTriangleNonRecursiveCTypesULong Cython 0.178073 0.797033 25.465361
10 PyPascalTriangleNonRecursiveLessCLike Python 0.185789 0.831567 20.254859
11 PyPascalTriangleNonRecursive Python 0.186182 0.833327 20.000922
12 PyPascalTriangleNonRecursiveIterators Python 0.196337 0.878779 13.794221
13 PyPascalTriangleIterators Python 0.211882 0.948357 5.445494
14 PyPascalTriangleConstantLists Python 0.220266 0.985882 1.432032
15 PyPascalTriangleOriginal Python 0.223420 1.000000 0.000000
16 PyPascalTriangleOriginalPlus Python 0.224271 1.003808 -0.379308
17 PyPascalTriangleNonRecursiveCLikeImproved Python 0.234445 1.049345 -4.702476
18 PyPascalTriangleNonRecursiveCLike Python 0.270704 1.211637 -17.467005
19 PyPascalTriangleConstantTuples Python 0.322150 1.441902 -30.647168
Each implementation.build(900) called for 5 times (ordered by performance)
# Implementation Language Duration, secs Fraction of (*) duration % faster than (*)
1 CyPascalTriangleNonRecursiveIterators Cython 0.140652 0.651107 53.584705
2 CyPascalTriangleIterators Cython 0.141067 0.653028 53.132785
3 CyPascalTriangleNonRecursiveLessCLike Cython 0.164642 0.762161 31.205923
4 CyPascalTriangleNonRecursive Cython 0.169332 0.783872 27.571773
5 CyPascalTriangleConstantLists Cython 0.170383 0.788738 26.784890
6 PyPascalTriangleNonRecursiveIterators Python 0.186486 0.863282 15.837040
7 PyPascalTriangleIterators Python 0.186617 0.863888 15.755793
8 CyPascalTriangleNonRecursiveCLikeImproved Cython 0.194658 0.901112 10.974028
9 PyPascalTriangleOriginalPlus Python 0.213122 0.986585 1.359775
10 PyPascalTriangleOriginal Python 0.216020 1.000000 0.000000
11 PyPascalTriangleConstantLists Python 0.217825 1.008356 -0.828678
12 PyPascalTriangleNonRecursiveLessCLike Python 0.219190 1.014675 -1.446241
13 CyPascalTriangleNonRecursiveCLikeMoreImproved Cython 0.220848 1.022351 -2.186215
14 PyPascalTriangleNonRecursive Python 0.222758 1.031192 -3.024890
15 PyPascalTriangleNonRecursiveCLikeImproved Python 0.273294 1.265133 -20.956958
16 PyPascalTriangleConstantTuples Python 0.284513 1.317069 -24.073813
17 PyPascalTriangleNonRecursiveCLike Python 0.310638 1.438006 -30.459280
Each implementation.build(3000) called for 5 times (ordered by performance)
# Implementation Language Duration, secs Fraction of (*) duration % faster than (*)
1 CyPascalTriangleNonRecursiveIterators Cython 2.492651
2 CyPascalTriangleNonRecursive Cython 2.849843
3 CyPascalTriangleNonRecursiveLessCLike Cython 2.858547
4 PyPascalTriangleNonRecursiveIterators Python 3.019004
5 CyPascalTriangleNonRecursiveCLikeImproved Cython 3.216158
6 CyPascalTriangleNonRecursiveCLikeMoreImproved Cython 3.429079
7 PyPascalTriangleNonRecursiveLessCLike Python 3.462906
8 PyPascalTriangleNonRecursive Python 3.464945
9 PyPascalTriangleNonRecursiveCLikeImproved Python 4.003433
10 PyPascalTriangleNonRecursiveCLike Python 4.414496

PyPy

Each implementation.build(34) called for 1000 times (ordered by performance)
# Implementation Language Duration, secs Fraction of (*) duration % faster than (*)
1 PyPascalTriangleNonRecursiveCLike Python 0.009965 0.255112 291.985071
2 PyPascalTriangleNonRecursiveLessCLike Python 0.011276 0.288669 246.417169
3 PyPascalTriangleNonRecursive Python 0.011767 0.301237 231.964988
4 PyPascalTriangleNonRecursiveCLikeImproved Python 0.014874 0.380779 162.619819
5 PyPascalTriangleNonRecursiveArray Python 0.018903 0.483923 106.644384
6 PyPascalTriangleConstantLists Python 0.033238 0.850908 17.521573
7 PyPascalTriangleOriginalPlus Python 0.038394 0.982898 1.739984
8 PyPascalTriangleOriginal Python 0.039062 1.000000 0.000000
9 PyPascalTriangleNonRecursiveIterators Python 0.041938 1.073628 -6.857835
10 PyPascalTriangleConstantTuples Python 0.050661 1.296940 -22.895411
11 PyPascalTriangleIterators Python 0.057787 1.479370 -32.403652
Each implementation.build(67) called for 1000 times (ordered by performance)
# Implementation Language Duration, secs Fraction of (*) duration % faster than (*)
1 PyPascalTriangleNonRecursiveCLikeImproved Python 0.021272 0.340252 193.899487
2 PyPascalTriangleNonRecursiveLessCLike Python 0.021738 0.347700 187.604058
3 PyPascalTriangleNonRecursiveCLike Python 0.022324 0.357078 180.051050
4 PyPascalTriangleNonRecursive Python 0.023544 0.376592 165.539589
5 PyPascalTriangleConstantLists Python 0.061327 0.980936 1.943442
6 PyPascalTriangleOriginalPlus Python 0.062375 0.997700 0.230487
7 PyPascalTriangleOriginal Python 0.062519 1.000000 0.000000
8 PyPascalTriangleNonRecursiveIterators Python 0.076089 1.217060 -17.834750
9 PyPascalTriangleConstantTuples Python 0.101974 1.631093 -38.691403
10 PyPascalTriangleIterators Python 0.103806 1.660396 -39.773401
Each implementation.build(900) called for 5 times (ordered by performance)
# Implementation Language Duration, secs Fraction of (*) duration % faster than (*)
1 PyPascalTriangleNonRecursiveLessCLike Python 0.124087 0.269902 270.505017
2 PyPascalTriangleNonRecursive Python 0.127564 0.277465 260.405912
3 PyPascalTriangleNonRecursiveCLikeImproved Python 0.129059 0.280717 256.230684
4 PyPascalTriangleNonRecursiveCLike Python 0.134702 0.292991 241.307584
5 PyPascalTriangleNonRecursiveIterators Python 0.206257 0.448630 122.900715
6 PyPascalTriangleConstantLists Python 0.358617 0.780030 28.200265
7 PyPascalTriangleOriginalPlus Python 0.362363 0.788177 26.875041
8 PyPascalTriangleIterators Python 0.364880 0.793652 25.999760
9 PyPascalTriangleConstantTuples Python 0.374680 0.814968 22.704246
10 PyPascalTriangleOriginal Python 0.459748 1.000000 0.000000
Each implementation.build(3000) called for 5 times (ordered by performance)
# Implementation Language Duration, secs Fraction of (*) duration % faster than (*)
1 PyPascalTriangleNonRecursiveCLikeImproved Python 4.349007
2 PyPascalTriangleNonRecursiveLessCLike Python 4.362742
3 PyPascalTriangleNonRecursive Python 4.366922
4 PyPascalTriangleNonRecursiveCLike Python 4.501581
5 PyPascalTriangleNonRecursiveIterators Python 5.453328

CPython vs Cython vs PyPy

# Implementation CPython, s Cython, s / % faster PyPy, s / % faster
1 PyPascalTriangleNonRecursive 0.054124 0.040927 / +32.24 0.011875 / +355.78
2 PyPascalTriangleNonRecursiveLessCLike 0.054172 0.040580 / +33.49 0.011593 / +367.28
3 PyPascalTriangleNonRecursiveCLikeImproved 0.065096 0.043069 / +51.14 0.014917 / +336.39
4 PyPascalTriangleNonRecursiveIterators 0.065820 0.049754 / +32.29 0.036597 / +79.85
5 PyPascalTriangleIterators 0.072989 0.053422 / +36.63 0.059628 / +22.41
6 PyPascalTriangleOriginalPlus 0.073143
0.038411 / +90.42
7 PyPascalTriangleNonRecursiveCLike 0.073623
0.009877 / +645.40
8 PyPascalTriangleConstantLists 0.074328 0.054876 / +35.45 0.033842 / +119.63
9 PyPascalTriangleOriginal 0.075939
0.039253 / +93.46
10 PyPascalTriangleNonRecursiveArray 0.092238
0.018299 / +404.06
11 PyPascalTriangleConstantTuples 0.105243
0.049954 / +110.68

Pure C implementations

  • Most optimal version is more than 200 times faster than original CPython version
  • Compiler-optimized (-O3 and -Ofast options) C code runs faster than Assembler code
  • Python types conversion required for C extensions come with up to 10% overhead on function call
  • gcc optimization options may improve run time up to 3 times
Compilation script:
#!/usr/bin/env bash

set -x

gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -o ccpascal_triangle_normal ./pascal_triangle/ccpascal_triangle.c
chmod a+x ./ccpascal_triangle_normal

gcc -O2 -o ccpascal_triangle_O2 ./pascal_triangle/ccpascal_triangle.c
chmod a+x ./ccpascal_triangle_O2

gcc -O3 -o ccpascal_triangle_O3 ./pascal_triangle/ccpascal_triangle.c
chmod a+x ./ccpascal_triangle_O3

gcc -Ofast -o ccpascal_triangle_Ofast ./pascal_triangle/ccpascal_triangle.c
chmod a+x ./ccpascal_triangle_Ofast
Output of $ ./ccpascal_triangle_normal
0.001524 seconds: 1000 times: c_print_pascal_triangle(34)
0.001165 seconds: 1000 times: c_print_pascal_triangle_ulong(34)
0.000761 seconds: 1000 times: c_print_pascal_triangle_full_asm_implementation(34)
Output of $ ./ccpascal_triangle_O2
0.001530 seconds: 1000 times: c_print_pascal_triangle(34)
0.001279 seconds: 1000 times: c_print_pascal_triangle_ulong(34)
0.000578 seconds: 1000 times: c_print_pascal_triangle_full_asm_implementation(34)
Output of $ ./ccpascal_triangle_O3
0.000290 seconds: 1000 times: c_print_pascal_triangle(34)
0.000340 seconds: 1000 times: c_print_pascal_triangle_ulong(34)
0.000480 seconds: 1000 times: c_print_pascal_triangle_full_asm_implementation(34)
Output of $ ./ccpascal_triangle_Ofast
0.000291 seconds: 1000 times: c_print_pascal_triangle(34)
0.000352 seconds: 1000 times: c_print_pascal_triangle_ulong(34)
0.000538 seconds: 1000 times: c_print_pascal_triangle_full_asm_implementation(34)

Environment

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 69
Stepping:              1
CPU MHz:               2601.000
BogoMIPS:              5188.20
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              3072K
NUMA node0 CPU(s):     0-3
  current CPU frequency is 2.60 GHz.
  current CPU frequency is 2.60 GHz.
  current CPU frequency is 2.60 GHz.
  current CPU frequency is 2.60 GHz.
Linux 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Linux Mint 17.2 Rafaela \n \l
Python 2.7.6
Cython version 0.23.4

gcc (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Source code

Please, see class’s docstring and diff to review the difference with base implementation.

PascalTriangleBase

class PascalTriangleBase(object):

    language = NotImplemented
    original = False
    max_height = None

    EMPTY_LIST = []
    ZERO_LIST = [0]
    ONE_LIST = [1]

    EMPTY_TUPLE = ()
    ZERO_TUPLE = (0,)
    ONE_TUPLE = (1,)

    def __init__(self, verbose=False, return_list=False):
        self.verbose = verbose
        self.return_list = return_list

    def _test_print(self, arg):
        print arg

    def _print(self, arg):
        """
        Print Pascal's Triangle line
        :param arg: triangle line
        """
        if not self.verbose:
            return

        if not isinstance(arg, list):
            arg = list(arg)

        self._test_print(arg)

    def build(self, height):
        """Build Pascal's Triangle of given height.
        :param height: height of the triangle to print (zero-based)
        :return: last line of Pascal's Triangle
        """
        raise NotImplementedError()

PyPascalTriangleBase

from pascal_triangle.implementations.base import PascalTriangleBase


class PyPascalTriangleBase(PascalTriangleBase):

    language = 'Python'

PyPascalTriangleOriginal

from .base import PyPascalTriangleBase


class PyPascalTriangleOriginal(PyPascalTriangleBase):
    """
    Modified original solution presented on interview.
    """

    original = True
    max_height = 900

    def build(self, height):
        if height <= 0:
            self._print([1])
            return [1]

        a = [0] + self.build(height - 1) + [0]
        b = [a[i] + a[i + 1] for i in xrange(len(a) - 1)]
        self._print(b)
        return b  # return last line for testing purposes

PyPascalTriangleOriginalPlus

from .base import PyPascalTriangleBase


class PyPascalTriangleOriginalPlus(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleOriginal`.

    Difference:
        - Better naming of local variables
        - Fixed corner cases
    """

    max_height = 900

    def build(self, height):
        if height == 0:
            self._print([1])
            return [1]
        elif height < 0:
            return []

        prev_line = [0] + self.build(height - 1) + [0]
        line = [prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1)]
        self._print(line)
        return line  # return last line for testing purposes
Diff
--- ../implementations/python/original.py
+++ ../implementations/python/original_plus.py
@@ -1,20 +1,25 @@
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleOriginal(PyPascalTriangleBase):
+class PyPascalTriangleOriginalPlus(PyPascalTriangleBase):
     """
-    Modified original solution presented on interview.
+    Based on :py:class::`PascalTriangleOriginal`.
+
+    Difference:
+        - Better naming of local variables
+        - Fixed corner cases
     """
 
-    original = True
     max_height = 900
 
     def build(self, height):
-        if height <= 0:
+        if height == 0:
             self._print([1])
             return [1]
+        elif height < 0:
+            return []
 
-        a = [0] + self.build(height - 1) + [0]
-        b = [a[i] + a[i + 1] for i in xrange(len(a) - 1)]
-        self._print(b)
-        return b  # return last line for testing purposes
+        prev_line = [0] + self.build(height - 1) + [0]
+        line = [prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1)]
+        self._print(line)
+        return line  # return last line for testing purposes

PyPascalTriangleConstantLists

from .base import PyPascalTriangleBase


class PyPascalTriangleConstantLists(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleOriginalPlus`.

    Difference:
        - Constant lists replaced with named constants
    """

    max_height = 900

    def build(self, height):
        if height == 0:
            self._print(self.ONE_LIST)
            return self.ONE_LIST
        elif height < 0:
            return self.EMPTY_LIST

        prev_line = self.ZERO_LIST + self.build(height - 1) + self.ZERO_LIST
        line = [prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1)]
        self._print(line)
        return line  # return last line for testing purposes
Diff
--- ../implementations/python/original_plus.py
+++ ../implementations/python/constant_lists.py
@@ -1,25 +1,24 @@
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleOriginalPlus(PyPascalTriangleBase):
+class PyPascalTriangleConstantLists(PyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleOriginal`.
+    Based on :py:class::`PascalTriangleOriginalPlus`.
 
     Difference:
-        - Better naming of local variables
-        - Fixed corner cases
+        - Constant lists replaced with named constants
     """
 
     max_height = 900
 
     def build(self, height):
         if height == 0:
-            self._print([1])
-            return [1]
+            self._print(self.ONE_LIST)
+            return self.ONE_LIST
         elif height < 0:
-            return []
+            return self.EMPTY_LIST
 
-        prev_line = [0] + self.build(height - 1) + [0]
+        prev_line = self.ZERO_LIST + self.build(height - 1) + self.ZERO_LIST
         line = [prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1)]
         self._print(line)
         return line  # return last line for testing purposes

PyPascalTriangleConstantTuples

from .base import PyPascalTriangleBase


class PyPascalTriangleConstantTuples(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleConstantLists`.

    Difference:
        - Lists replaced with tuples
    """

    max_height = 900

    def build(self, height):
        if height == 0:
            self._print(self.ONE_TUPLE)
            return self.ONE_TUPLE
        elif height < 0:
            return self.EMPTY_TUPLE

        prev_line = self.ZERO_TUPLE + self.build(height - 1) + self.ZERO_TUPLE
        line = tuple(prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1))
        self._print(line)
        return line  # return last line for testing purposes
Diff
--- ../implementations/python/constant_lists.py
+++ ../implementations/python/constant_tuples.py
@@ -1,24 +1,24 @@
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleConstantLists(PyPascalTriangleBase):
+class PyPascalTriangleConstantTuples(PyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleOriginalPlus`.
+    Based on :py:class::`PascalTriangleConstantLists`.
 
     Difference:
-        - Constant lists replaced with named constants
+        - Lists replaced with tuples
     """
 
     max_height = 900
 
     def build(self, height):
         if height == 0:
-            self._print(self.ONE_LIST)
-            return self.ONE_LIST
+            self._print(self.ONE_TUPLE)
+            return self.ONE_TUPLE
         elif height < 0:
-            return self.EMPTY_LIST
+            return self.EMPTY_TUPLE
 
-        prev_line = self.ZERO_LIST + self.build(height - 1) + self.ZERO_LIST
-        line = [prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1)]
+        prev_line = self.ZERO_TUPLE + self.build(height - 1) + self.ZERO_TUPLE
+        line = tuple(prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1))
         self._print(line)
         return line  # return last line for testing purposes

PyPascalTriangleIterators

import itertools

from .base import PyPascalTriangleBase


class PyPascalTriangleIterators(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleConstantLists`.

    Difference:
        - Iterators are used instead of lists where it is possible
    """

    max_height = 900

    def build(self, height):
        if height == 0:
            self._print(self.ONE_LIST)
            return self.ONE_LIST
        elif height < 0:
            return self.EMPTY_LIST

        prev_line = self.build(height - 1)
        iterator = itertools.chain(self.ZERO_LIST, prev_line)
        ahead_iterator = itertools.chain(prev_line, self.ZERO_LIST)
        line = [x + y for x, y in itertools.izip(iterator, ahead_iterator)]
        self._print(line)
        return line  # return last line for testing purposes
Diff
--- ../implementations/python/constant_lists.py
+++ ../implementations/python/iterators.py
@@ -1,12 +1,14 @@
+import itertools
+
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleConstantLists(PyPascalTriangleBase):
+class PyPascalTriangleIterators(PyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleOriginalPlus`.
+    Based on :py:class::`PascalTriangleConstantLists`.
 
     Difference:
-        - Constant lists replaced with named constants
+        - Iterators are used instead of lists where it is possible
     """
 
     max_height = 900
@@ -18,7 +20,9 @@
         elif height < 0:
             return self.EMPTY_LIST
 
-        prev_line = self.ZERO_LIST + self.build(height - 1) + self.ZERO_LIST
-        line = [prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1)]
+        prev_line = self.build(height - 1)
+        iterator = itertools.chain(self.ZERO_LIST, prev_line)
+        ahead_iterator = itertools.chain(prev_line, self.ZERO_LIST)
+        line = [x + y for x, y in itertools.izip(iterator, ahead_iterator)]
         self._print(line)
         return line  # return last line for testing purposes

PyPascalTriangleNonRecursive

from .base import PyPascalTriangleBase


class PyPascalTriangleNonRecursive(PyPascalTriangleBase):
    """
    A non-recursive implementation of Pascal Triangle printing algorithm.
    """

    def build(self, height):
        line = []
        for x in xrange(height + 1):
            line = self.ONE_LIST + line
            for i in xrange(1, x):
                line[i] += line[i + 1]
            self._print(line)

        return line  # return last line for testing purposes

PyPascalTriangleNonRecursiveIterators

import itertools

from .base import PyPascalTriangleBase


class PyPascalTriangleNonRecursiveIterators(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleNonRecursive`.

    Difference:
        - iterators are used instead of lists
    """

    def build(self, height):
        if height < 0:
            return self.EMPTY_LIST

        line = self.ONE_LIST
        self._print(line)
        for _ in xrange(height):
            iterator = itertools.chain(self.ZERO_LIST, line)
            ahead_iterator = itertools.chain(line, self.ZERO_LIST)
            line = [x + y for x, y in itertools.izip(iterator, ahead_iterator)]
            self._print(line)

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive.py
+++ ../implementations/python/non_recursive_iterators.py
@@ -1,17 +1,26 @@
+import itertools
+
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursive(PyPascalTriangleBase):
+class PyPascalTriangleNonRecursiveIterators(PyPascalTriangleBase):
     """
-    A non-recursive implementation of Pascal Triangle printing algorithm.
+    Based on :py:class::`PascalTriangleNonRecursive`.
+
+    Difference:
+        - iterators are used instead of lists
     """
 
     def build(self, height):
-        line = []
-        for x in xrange(height + 1):
-            line = self.ONE_LIST + line
-            for i in xrange(1, x):
-                line[i] += line[i + 1]
+        if height < 0:
+            return self.EMPTY_LIST
+
+        line = self.ONE_LIST
+        self._print(line)
+        for _ in xrange(height):
+            iterator = itertools.chain(self.ZERO_LIST, line)
+            ahead_iterator = itertools.chain(line, self.ZERO_LIST)
+            line = [x + y for x, y in itertools.izip(iterator, ahead_iterator)]
             self._print(line)
 
         return line  # return last line for testing purposes

PyPascalTriangleNonRecursiveCLike

from .base import PyPascalTriangleBase


class PyPascalTriangleNonRecursiveCLike(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleNonRecursive`.

    Difference:
        - All required memory is allocated at once
        - List concatenations are replaced with value assignment by index
    """

    def build(self, height):
        line = self.ZERO_LIST * (height + 1)
        start = height
        size = 0
        while size <= height:
            line[start] = 1
            i = 1
            while i < size:
                index = start + i
                line[index] += line[index + 1]
                i += 1
            print_line = line[start:start + size + 1]
            self._print(print_line)
            start -= 1
            size += 1

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive.py
+++ ../implementations/python/non_recursive_c_like.py
@@ -1,17 +1,29 @@
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursive(PyPascalTriangleBase):
+class PyPascalTriangleNonRecursiveCLike(PyPascalTriangleBase):
     """
-    A non-recursive implementation of Pascal Triangle printing algorithm.
+    Based on :py:class::`PascalTriangleNonRecursive`.
+
+    Difference:
+        - All required memory is allocated at once
+        - List concatenations are replaced with value assignment by index
     """
 
     def build(self, height):
-        line = []
-        for x in xrange(height + 1):
-            line = self.ONE_LIST + line
-            for i in xrange(1, x):
-                line[i] += line[i + 1]
-            self._print(line)
+        line = self.ZERO_LIST * (height + 1)
+        start = height
+        size = 0
+        while size <= height:
+            line[start] = 1
+            i = 1
+            while i < size:
+                index = start + i
+                line[index] += line[index + 1]
+                i += 1
+            print_line = line[start:start + size + 1]
+            self._print(print_line)
+            start -= 1
+            size += 1
 
         return line  # return last line for testing purposes

PyPascalTriangleNonRecursiveCLikeImproved

from .base import PyPascalTriangleBase


class PyPascalTriangleNonRecursiveCLikeImproved(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleNonRecursiveCLike`.

    Difference:
        - Removed some unneeded computations
    """

    def build(self, height):
        line = self.ONE_LIST * (height + 1)
        start = height
        size = 0
        while size <= height:
            index = start + 1
            while index < height:
                line[index] += line[index + 1]
                index += 1
            print_line = line[start:start + size + 1]
            self._print(print_line)
            start -= 1
            size += 1

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive_c_like.py
+++ ../implementations/python/non_recursive_c_like_improved.py
@@ -1,26 +1,23 @@
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursiveCLike(PyPascalTriangleBase):
+class PyPascalTriangleNonRecursiveCLikeImproved(PyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleNonRecursive`.
+    Based on :py:class::`PascalTriangleNonRecursiveCLike`.
 
     Difference:
-        - All required memory is allocated at once
-        - List concatenations are replaced with value assignment by index
+        - Removed some unneeded computations
     """
 
     def build(self, height):
-        line = self.ZERO_LIST * (height + 1)
+        line = self.ONE_LIST * (height + 1)
         start = height
         size = 0
         while size <= height:
-            line[start] = 1
-            i = 1
-            while i < size:
-                index = start + i
+            index = start + 1
+            while index < height:
                 line[index] += line[index + 1]
-                i += 1
+                index += 1
             print_line = line[start:start + size + 1]
             self._print(print_line)
             start -= 1

PyPascalTriangleNonRecursiveLessCLike

from .base import PyPascalTriangleBase


class PyPascalTriangleNonRecursiveLessCLike(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleNonRecursiveCLike`.

    Difference:
        - xrange objects are used instead of while-loops
    """

    def build(self, height):
        line = self.ONE_LIST * (height + 1)
        start = height
        for size in xrange(height + 1):
            for index in xrange(start + 1, height):
                line[index] += line[index + 1]
            print_line = line[start:]
            self._print(print_line)
            start -= 1

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive_c_like.py
+++ ../implementations/python/non_recursive_less_c_like.py
@@ -1,29 +1,22 @@
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursiveCLike(PyPascalTriangleBase):
+class PyPascalTriangleNonRecursiveLessCLike(PyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleNonRecursive`.
+    Based on :py:class::`PascalTriangleNonRecursiveCLike`.
 
     Difference:
-        - All required memory is allocated at once
-        - List concatenations are replaced with value assignment by index
+        - xrange objects are used instead of while-loops
     """
 
     def build(self, height):
-        line = self.ZERO_LIST * (height + 1)
+        line = self.ONE_LIST * (height + 1)
         start = height
-        size = 0
-        while size <= height:
-            line[start] = 1
-            i = 1
-            while i < size:
-                index = start + i
+        for size in xrange(height + 1):
+            for index in xrange(start + 1, height):
                 line[index] += line[index + 1]
-                i += 1
-            print_line = line[start:start + size + 1]
+            print_line = line[start:]
             self._print(print_line)
             start -= 1
-            size += 1
 
         return line  # return last line for testing purposes

PyPascalTriangleNonRecursiveArray

import itertools
from array import array

from .base import PyPascalTriangleBase


class PyPascalTriangleNonRecursiveArray(PyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleNonRecursiveLessCLike`.

    Difference:
        - array object is used instead of list
    """

    max_height = 34

    def build(self, height):
        line = array('I', itertools.repeat(1, height + 1))
        start = height
        for size in xrange(height + 1):
            for index in xrange(start + 1, height):
                line[index] += line[index + 1]
            print_line = line[start:]
            self._print(print_line)
            start -= 1

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive_less_c_like.py
+++ ../implementations/python/non_recursive_array.py
@@ -1,16 +1,21 @@
+import itertools
+from array import array
+
 from .base import PyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursiveLessCLike(PyPascalTriangleBase):
+class PyPascalTriangleNonRecursiveArray(PyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleNonRecursiveCLike`.
+    Based on :py:class::`PascalTriangleNonRecursiveLessCLike`.
 
     Difference:
-        - xrange objects are used instead of while-loops
+        - array object is used instead of list
     """
 
+    max_height = 34
+
     def build(self, height):
-        line = self.ONE_LIST * (height + 1)
+        line = array('I', itertools.repeat(1, height + 1))
         start = height
         for size in xrange(height + 1):
             for index in xrange(start + 1, height):

CyPascalTriangleBase

from pascal_triangle.implementations.base import PascalTriangleBase


class CyPascalTriangleBase(PascalTriangleBase):

    language = 'Cython'

CyPascalTriangleConstantLists

from .base import CyPascalTriangleBase


class CyPascalTriangleConstantLists(CyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleConstantLists`.

    Difference:
        - Compiled with Cython
    """

    max_height = 900

    def build(self, height):
        if height == 0:
            self._print(self.ONE_LIST)
            return self.ONE_LIST
        elif height < 0:
            return self.EMPTY_LIST

        prev_line = self.ZERO_LIST + self.build(height - 1) + self.ZERO_LIST
        line = [prev_line[i] + prev_line[i + 1] for i in xrange(len(prev_line) - 1)]
        self._print(line)
        return line  # return last line for testing purposes
Diff
--- ../implementations/python/constant_lists.py
+++ ../implementations/cython/constant_lists.pyx
@@ -1,12 +1,12 @@
-from .base import PyPascalTriangleBase
+from .base import CyPascalTriangleBase
 
 
-class PyPascalTriangleConstantLists(PyPascalTriangleBase):
+class CyPascalTriangleConstantLists(CyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleOriginalPlus`.
+    Based on :py:class::`PascalTriangleConstantLists`.
 
     Difference:
-        - Constant lists replaced with named constants
+        - Compiled with Cython
     """
 
     max_height = 900

CyPascalTriangleIterators

import itertools

from .base import CyPascalTriangleBase


class CyPascalTriangleIterators(CyPascalTriangleBase):
    """
    Based on :py:class::`PascalTriangleIterators`.

    Difference:
        - Compiled with Cython
    """

    max_height = 900

    def build(self, height):
        if height == 0:
            self._print(self.ONE_LIST)
            return self.ONE_LIST
        elif height < 0:
            return self.EMPTY_LIST

        prev_line = self.build(height - 1)
        iterator = itertools.chain(self.ZERO_LIST, prev_line)
        ahead_iterator = itertools.chain(prev_line, self.ZERO_LIST)
        line = [x + y for x, y in itertools.izip(iterator, ahead_iterator)]
        self._print(line)
        return line  # return last line for testing purposes
Diff
--- ../implementations/python/iterators.py
+++ ../implementations/cython/iterators.pyx
@@ -1,14 +1,14 @@
 import itertools
 
-from .base import PyPascalTriangleBase
+from .base import CyPascalTriangleBase
 
 
-class PyPascalTriangleIterators(PyPascalTriangleBase):
+class CyPascalTriangleIterators(CyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleConstantLists`.
+    Based on :py:class::`PascalTriangleIterators`.
 
     Difference:
-        - Iterators are used instead of lists where it is possible
+        - Compiled with Cython
     """
 
     max_height = 900

CyPascalTriangleNonRecursive

from .base import CyPascalTriangleBase


class CyPascalTriangleNonRecursive(CyPascalTriangleBase):
    """
    Based on :py:class::`PyPascalTriangleNonRecursive`.

    Difference:
        - Compiled with Cython
    """

    def build(self, height):
        line = []
        for x in xrange(height + 1):
            line = self.ONE_LIST + line
            for i in xrange(1, x):
                line[i] += line[i + 1]
            self._print(line)

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive.py
+++ ../implementations/cython/non_recursive.pyx
@@ -1,9 +1,12 @@
-from .base import PyPascalTriangleBase
+from .base import CyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursive(PyPascalTriangleBase):
+class CyPascalTriangleNonRecursive(CyPascalTriangleBase):
     """
-    A non-recursive implementation of Pascal Triangle printing algorithm.
+    Based on :py:class::`PyPascalTriangleNonRecursive`.
+
+    Difference:
+        - Compiled with Cython
     """
 
     def build(self, height):

CyPascalTriangleNonRecursiveIterators

import itertools

from .base import CyPascalTriangleBase


class CyPascalTriangleNonRecursiveIterators(CyPascalTriangleBase):
    """
    Based on :py:class::`PyPascalTriangleNonRecursiveIterators`.

    Difference:
        - Compiled with Cython
    """

    def build(self, height):
        if height < 0:
            return self.EMPTY_LIST

        line = self.ONE_LIST
        self._print(line)
        for _ in xrange(height):
            iterator = itertools.chain(self.ZERO_LIST, line)
            ahead_iterator = itertools.chain(line, self.ZERO_LIST)
            line = [x + y for x, y in itertools.izip(iterator, ahead_iterator)]
            self._print(line)

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive_iterators.py
+++ ../implementations/cython/non_recursive_iterators.pyx
@@ -1,14 +1,14 @@
 import itertools
 
-from .base import PyPascalTriangleBase
+from .base import CyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursiveIterators(PyPascalTriangleBase):
+class CyPascalTriangleNonRecursiveIterators(CyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleNonRecursive`.
+    Based on :py:class::`PyPascalTriangleNonRecursiveIterators`.
 
     Difference:
-        - iterators are used instead of lists
+        - Compiled with Cython
     """
 
     def build(self, height):

CyPascalTriangleNonRecursiveCLikeImproved

from .base import CyPascalTriangleBase


class CyPascalTriangleNonRecursiveCLikeImproved(CyPascalTriangleBase):
    """
    Based on :py:class::`PyPascalTriangleNonRecursiveCLikeImproved`.

    Difference:
        - Compiled with Cython
    """

    def build(self, height):
        line = self.ONE_LIST * (height + 1)
        start = height
        size = 0
        while size <= height:
            index = start + 1
            while index < height:
                line[index] += line[index + 1]
                index += 1
            print_line = line[start:start + size + 1]
            self._print(print_line)
            start -= 1
            size += 1

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive_c_like_improved.py
+++ ../implementations/cython/non_recursive_c_like_improved.pyx
@@ -1,12 +1,12 @@
-from .base import PyPascalTriangleBase
+from .base import CyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursiveCLikeImproved(PyPascalTriangleBase):
+class CyPascalTriangleNonRecursiveCLikeImproved(CyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleNonRecursiveCLike`.
+    Based on :py:class::`PyPascalTriangleNonRecursiveCLikeImproved`.
 
     Difference:
-        - Removed some unneeded computations
+        - Compiled with Cython
     """
 
     def build(self, height):

CyPascalTriangleNonRecursiveCLikeMoreImproved

import sys

from .base import CyPascalTriangleBase


class CyPascalTriangleNonRecursiveCLikeMoreImproved(CyPascalTriangleBase):
    """
    Based on :py:class::`CyPascalTriangleNonRecursiveCLikeImproved`.

    Difference:
        - List slicing replaced with character by character output
    """

    def build(self, height, verbose=False):
        line = self.ONE_LIST * (height + 1)
        start = height
        size = 0
        while size <= height:
            index = start + 1
            while index < height:
                line[index] += line[index + 1]
                index += 1

            i = start
            finish = start + size + 1
            while i < finish:
                if verbose:
                    sys.stdout.write(str(line[i]) + " ")
                i += 1
            if verbose:
                print

            start -= 1
            size += 1

        return line  # return last line for testing purposes
Diff
--- ../implementations/cython/non_recursive_c_like_improved.pyx
+++ ../implementations/cython/non_recursive_c_like_more_improved.pyx
@@ -1,15 +1,17 @@
+import sys
+
 from .base import CyPascalTriangleBase
 
 
-class CyPascalTriangleNonRecursiveCLikeImproved(CyPascalTriangleBase):
+class CyPascalTriangleNonRecursiveCLikeMoreImproved(CyPascalTriangleBase):
     """
-    Based on :py:class::`PyPascalTriangleNonRecursiveCLikeImproved`.
+    Based on :py:class::`CyPascalTriangleNonRecursiveCLikeImproved`.
 
     Difference:
-        - Compiled with Cython
+        - List slicing replaced with character by character output
     """
 
-    def build(self, height):
+    def build(self, height, verbose=False):
         line = self.ONE_LIST * (height + 1)
         start = height
         size = 0
@@ -18,8 +20,16 @@
             while index < height:
                 line[index] += line[index + 1]
                 index += 1
-            print_line = line[start:start + size + 1]
-            self._print(print_line)
+
+            i = start
+            finish = start + size + 1
+            while i < finish:
+                if verbose:
+                    sys.stdout.write(str(line[i]) + " ")
+                i += 1
+            if verbose:
+                print
+
             start -= 1
             size += 1
 

CyPascalTriangleNonRecursiveLessCLike

from .base import CyPascalTriangleBase


class CyPascalTriangleNonRecursiveLessCLike(CyPascalTriangleBase):
    """
    Based on :py:class::`PyPascalTriangleNonRecursiveLessCLike`.

    Difference:
        - Compiled with Cython
    """

    def build(self, height):
        line = self.ONE_LIST * (height + 1)
        start = height
        for size in xrange(height + 1):
            for index in xrange(start + 1, height):
                line[index] += line[index + 1]
            print_line = line[start:]
            self._print(print_line)
            start -= 1

        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive_less_c_like.py
+++ ../implementations/cython/non_recursive_less_c_like.pyx
@@ -1,12 +1,12 @@
-from .base import PyPascalTriangleBase
+from .base import CyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursiveLessCLike(PyPascalTriangleBase):
+class CyPascalTriangleNonRecursiveLessCLike(CyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleNonRecursiveCLike`.
+    Based on :py:class::`PyPascalTriangleNonRecursiveLessCLike`.
 
     Difference:
-        - xrange objects are used instead of while-loops
+        - Compiled with Cython
     """
 
     def build(self, height):

CyPascalTriangleNonRecursiveCTypes

import sys

import itertools
from array import array
from cpython cimport array as c_array

from .base import CyPascalTriangleBase


class CyPascalTriangleNonRecursiveCTypes(CyPascalTriangleBase):
    """
    Based on :py:class::`PyPascalTriangleNonRecursiveArray`.

    Difference:
        - Compiled with Cython
        - C type are used instead of Python types
    """

    max_height = 34

    def build(self, height, verbose=False):
        cdef c_array.array line = array('I', itertools.repeat(1, height + 1))
        cdef int start = height
        cdef int size = 0
        cdef int index
        cdef int i
        cdef int finish
        while size <= height:
            index = start + 1
            while index < height:
                line[index] += line[index + 1]
                index += 1

            i = start
            finish = start + size + 1
            while i < finish:
                if verbose:
                    sys.stdout.write(str(line[i]) + ' ')
                i += 1
            if verbose:
                print

            start -= 1
            size += 1
        return line  # return last line for testing purposes
Diff
--- ../implementations/python/non_recursive_array.py
+++ ../implementations/cython/c_types.pyx
@@ -1,27 +1,45 @@
+import sys
+
 import itertools
 from array import array
+from cpython cimport array as c_array
 
-from .base import PyPascalTriangleBase
+from .base import CyPascalTriangleBase
 
 
-class PyPascalTriangleNonRecursiveArray(PyPascalTriangleBase):
+class CyPascalTriangleNonRecursiveCTypes(CyPascalTriangleBase):
     """
-    Based on :py:class::`PascalTriangleNonRecursiveLessCLike`.
+    Based on :py:class::`PyPascalTriangleNonRecursiveArray`.
 
     Difference:
-        - array object is used instead of list
+        - Compiled with Cython
+        - C type are used instead of Python types
     """
 
     max_height = 34
 
-    def build(self, height):
-        line = array('I', itertools.repeat(1, height + 1))
-        start = height
-        for size in xrange(height + 1):
-            for index in xrange(start + 1, height):
+    def build(self, height, verbose=False):
+        cdef c_array.array line = array('I', itertools.repeat(1, height + 1))
+        cdef int start = height
+        cdef int size = 0
+        cdef int index
+        cdef int i
+        cdef int finish
+        while size <= height:
+            index = start + 1
+            while index < height:
                 line[index] += line[index + 1]
-            print_line = line[start:]
-            self._print(print_line)
+                index += 1
+
+            i = start
+            finish = start + size + 1
+            while i < finish:
+                if verbose:
+                    sys.stdout.write(str(line[i]) + ' ')
+                i += 1
+            if verbose:
+                print
+
             start -= 1
-
+            size += 1
         return line  # return last line for testing purposes

CyPascalTriangleNonRecursiveCTypesULong

import sys

import itertools
from array import array
from cpython cimport array as c_array

from .base import CyPascalTriangleBase


class CyPascalTriangleNonRecursiveCTypesULong(CyPascalTriangleBase):
    """
    Based on :py:class::`CyPascalTriangleNonRecursiveCTypes`.

    Difference:
        - 64-bit integers are used to store values instead of 32-bit integers
    """

    max_height = 67

    def build(self, height, verbose=False):
        cdef c_array.array line = array('L', itertools.repeat(1, height + 1))
        cdef int start = height
        cdef int size = 0
        cdef int index
        cdef int i
        cdef int finish
        while size <= height:
            index = start + 1
            while index < height:
                line[index] += line[index + 1]
                index += 1

            i = start
            finish = start + size + 1
            while i < finish:
                if verbose:
                    sys.stdout.write(str(line[i]) + ' ')
                i += 1
            if verbose:
                print

            start -= 1
            size += 1
        return line  # return last line for testing purposes
Diff
--- ../implementations/cython/c_types.pyx
+++ ../implementations/cython/c_types_ulong.pyx
@@ -7,19 +7,18 @@
 from .base import CyPascalTriangleBase
 
 
-class CyPascalTriangleNonRecursiveCTypes(CyPascalTriangleBase):
+class CyPascalTriangleNonRecursiveCTypesULong(CyPascalTriangleBase):
     """
-    Based on :py:class::`PyPascalTriangleNonRecursiveArray`.
+    Based on :py:class::`CyPascalTriangleNonRecursiveCTypes`.
 
     Difference:
-        - Compiled with Cython
-        - C type are used instead of Python types
+        - 64-bit integers are used to store values instead of 32-bit integers
     """
 
-    max_height = 34
+    max_height = 67
 
     def build(self, height, verbose=False):
-        cdef c_array.array line = array('I', itertools.repeat(1, height + 1))
+        cdef c_array.array line = array('L', itertools.repeat(1, height + 1))
         cdef int start = height
         cdef int size = 0
         cdef int index

CPascalTriangleInitial

from .base import CPascalTriangleBase
from .cinitial import c_pascal_triangle_initial


class CPascalTriangleInitial(CPascalTriangleBase):
    """
    Based on :py:class::`CyPascalTriangleNonRecursiveCTypes`.

    Difference:
        - C implementation
    """

    max_height = 34

    def build(self, height):
        return c_pascal_triangle_initial(height, 0, self.return_list)
#include <Python.h>
#include <stdio.h>

// TODO LOW: Allocate memory on heap using malloc() instead of stack
#define MAX_HEIGHT_UINT 34


static PyObject* c_pascal_triangle_initial(PyObject* self, PyObject* args) {
    int height, verbose=0, return_list=0;
    if (!PyArg_ParseTuple(args, "i|ii", &height, &verbose, &return_list)) return NULL;

    unsigned int line[height + 1];

    int start = height;
    int size;
    int index;

    if(height > MAX_HEIGHT_UINT) {
        PyErr_Format(PyExc_ValueError, "Unable to build Pascal's Triangle higher than %d", MAX_HEIGHT_UINT);
        return NULL;
    }

    for (size = 0; size <= height; size++) {
        line[start] = 1;
        for (index = start + 1; index < height; index++) {
            line[index] += line[index + 1];
        }

        for (index = start + size; index >= start; index--) {
            if(verbose) printf("%u ", line[index]);
        }
        if(verbose) printf("\n");

        start--;
    }

    if(return_list) {
        PyObject* py_line = PyList_New(height + 1);
        for (index = height; index >= 0; index--) {
            PyList_SetItem(py_line, index, PyInt_FromLong((long) line[index]));
        }
        return py_line;
    } else {
        Py_INCREF(Py_None);
        return Py_None;
    }
}


static PyMethodDef cpascal_triangle_methods[] = {
    {"c_pascal_triangle_initial", c_pascal_triangle_initial, METH_VARARGS},
    {NULL, NULL}
};

void initcinitial(void) {
    (void) Py_InitModule("cinitial", cpascal_triangle_methods);
}

CPascalTriangleULong

from .base import CPascalTriangleBase
from .culong import c_pascal_triangle_ulong


class CPascalTriangleULong(CPascalTriangleBase):
    """
    Based on :py:class::`CPascalTriangleInitial`.

    Difference:
        - Unsigned long is used instead of unsigned int
    """

    max_height = 67

    def build(self, height):
        return c_pascal_triangle_ulong(height, 0, self.return_list)
#include <Python.h>
#include <stdio.h>

// TODO LOW: Allocate memory on heap using malloc() instead of stack

#define MAX_HEIGHT_ULONG 67


static PyObject* c_pascal_triangle_ulong(PyObject* self, PyObject* args) {
    int height, verbose=0, return_list=0;
    if (!PyArg_ParseTuple(args, "i|ii", &height, &verbose, &return_list)) return NULL;

    unsigned long line[height + 1];

    int start = height;
    int size;
    int index;

    if(height > MAX_HEIGHT_ULONG) {
        PyErr_Format(PyExc_ValueError, "Unable to build Pascal's Triangle higher than %d", MAX_HEIGHT_ULONG);
        return NULL;
    }

    for (size = 0; size <= height; size++) {
        line[start] = 1;
        for (index = start + 1; index < height; index++) {
            line[index] += line[index + 1];
        }

        for (index = start + size; index >= start; index--) {
            if(verbose) printf("%lu ", line[index]);
        }
        if(verbose) printf("\n");

        start--;
    }

    if(return_list) {
        PyObject* py_line = PyList_New(height + 1);
        for (index = height; index >= 0; index--) {
            PyList_SetItem(py_line, index, PyLong_FromUnsignedLong(line[index]));
        }
        return py_line;
    } else {
        Py_INCREF(Py_None);
        return Py_None;
    }
}


static PyMethodDef cpascal_triangle_methods[] = {
    {"c_pascal_triangle_ulong", c_pascal_triangle_ulong, METH_VARARGS},
    {NULL, NULL}
};

void initculong(void) {
    (void) Py_InitModule("culong", cpascal_triangle_methods);
}
Diff
--- ../implementations/cextension/initial.py
+++ ../implementations/cextension/ulong.py
@@ -1,16 +1,16 @@
 from .base import CPascalTriangleBase
-from .cinitial import c_pascal_triangle_initial
+from .culong import c_pascal_triangle_ulong
 
 
-class CPascalTriangleInitial(CPascalTriangleBase):
+class CPascalTriangleULong(CPascalTriangleBase):
     """
-    Based on :py:class::`CyPascalTriangleNonRecursiveCTypes`.
+    Based on :py:class::`CPascalTriangleInitial`.
 
     Difference:
-        - C implementation
+        - Unsigned long is used instead of unsigned int
     """
 
-    max_height = 34
+    max_height = 67
 
     def build(self, height):
-        return c_pascal_triangle_initial(height, 0, self.return_list)
+        return c_pascal_triangle_ulong(height, 0, self.return_list)
--- ../implementations/cextension/cinitial.c
+++ ../implementations/cextension/culong.c
@@ -2,21 +2,22 @@
 #include <stdio.h>
 
 // TODO LOW: Allocate memory on heap using malloc() instead of stack
-#define MAX_HEIGHT_UINT 34
+
+#define MAX_HEIGHT_ULONG 67
 
 
-static PyObject* c_pascal_triangle_initial(PyObject* self, PyObject* args) {
+static PyObject* c_pascal_triangle_ulong(PyObject* self, PyObject* args) {
     int height, verbose=0, return_list=0;
     if (!PyArg_ParseTuple(args, "i|ii", &height, &verbose, &return_list)) return NULL;
 
-    unsigned int line[height + 1];
+    unsigned long line[height + 1];
 
     int start = height;
     int size;
     int index;
 
-    if(height > MAX_HEIGHT_UINT) {
-        PyErr_Format(PyExc_ValueError, "Unable to build Pascal's Triangle higher than %d", MAX_HEIGHT_UINT);
+    if(height > MAX_HEIGHT_ULONG) {
+        PyErr_Format(PyExc_ValueError, "Unable to build Pascal's Triangle higher than %d", MAX_HEIGHT_ULONG);
         return NULL;
     }
 
@@ -27,7 +28,7 @@
         }
 
         for (index = start + size; index >= start; index--) {
-            if(verbose) printf("%u ", line[index]);
+            if(verbose) printf("%lu ", line[index]);
         }
         if(verbose) printf("\n");
 
@@ -37,7 +38,7 @@
     if(return_list) {
         PyObject* py_line = PyList_New(height + 1);
         for (index = height; index >= 0; index--) {
-            PyList_SetItem(py_line, index, PyInt_FromLong((long) line[index]));
+            PyList_SetItem(py_line, index, PyLong_FromUnsignedLong(line[index]));
         }
         return py_line;
     } else {
@@ -48,10 +49,10 @@
 
 
 static PyMethodDef cpascal_triangle_methods[] = {
-    {"c_pascal_triangle_initial", c_pascal_triangle_initial, METH_VARARGS},
+    {"c_pascal_triangle_ulong", c_pascal_triangle_ulong, METH_VARARGS},
     {NULL, NULL}
 };
 
-void initcinitial(void) {
-    (void) Py_InitModule("cinitial", cpascal_triangle_methods);
+void initculong(void) {
+    (void) Py_InitModule("culong", cpascal_triangle_methods);
 }

CPascalTrianglePartialAsm

from .base import CPascalTriangleBase
from .cpartial_asm import c_pascal_triangle_partial_asm


class CPascalTrianglePartialAsm(CPascalTriangleBase):
    """
    Based on :py:class::`CPascalTriangleInitial`.

    Difference:
        - Array initialization is implemented in Assembler
    """

    language = 'C/Assembler'
    max_height = 34

    def build(self, height):
        return c_pascal_triangle_partial_asm(height, 0, self.return_list)
#include <Python.h>
#include <stdio.h>

// TODO LOW: Allocate memory on heap using malloc() instead of stack

#define MAX_HEIGHT_UINT 34


static PyObject* c_pascal_triangle_partial_asm(PyObject* self, PyObject* args) {
    int height, verbose=0, return_list=0;
    if (!PyArg_ParseTuple(args, "i|ii", &height, &verbose, &return_list)) return NULL;

    unsigned int line[height + 1];

    int start = height;
    int size;
    int index;

    if(height > MAX_HEIGHT_UINT) {
        PyErr_Format(PyExc_ValueError, "Unable to build Pascal's Triangle higher than %d", MAX_HEIGHT_UINT);
        return NULL;
    }

    asm (
         // Fill in line with 1 and put 0 in the end
         "movl %0, %%ecx\n\t"
         "movq %1, %%rdi\n\t"
         "incl %%ecx\n\t"
         "movl $1, %%eax\n\t"
         "cld\n\t"
         "rep\n\t"
         "stosl\n\t"
         : /**/
         : "g" (height), "g" (line)
         : "eax", "ecx", "rdi"
        );

    for (size = 0; size <= height; size++) {
        for (index = start + 1; index < height; index++) {
            line[index] += line[index + 1];
        }

        for (index = start + size; index >= start; index--) {
            if(verbose) printf("%u ", line[index]);
        }
        if(verbose) printf("\n");

        start--;
    }

    if(return_list) {
        PyObject* py_line = PyList_New(height + 1);
        for (index = height; index >= 0; index--) {
            PyList_SetItem(py_line, index, PyInt_FromLong((long) line[index]));
        }
        return py_line;
    } else {
        Py_INCREF(Py_None);
        return Py_None;
    }
}


static PyMethodDef cpascal_triangle_methods[] = {
    {"c_pascal_triangle_partial_asm", c_pascal_triangle_partial_asm, METH_VARARGS},
    {NULL, NULL}
};

void initcpartial_asm(void) {
    (void) Py_InitModule("cpartial_asm", cpascal_triangle_methods);
}
Diff
--- ../implementations/cextension/initial.py
+++ ../implementations/cextension/partial_asm.py
@@ -1,16 +1,17 @@
 from .base import CPascalTriangleBase
-from .cinitial import c_pascal_triangle_initial
+from .cpartial_asm import c_pascal_triangle_partial_asm
 
 
-class CPascalTriangleInitial(CPascalTriangleBase):
+class CPascalTrianglePartialAsm(CPascalTriangleBase):
     """
-    Based on :py:class::`CyPascalTriangleNonRecursiveCTypes`.
+    Based on :py:class::`CPascalTriangleInitial`.
 
     Difference:
-        - C implementation
+        - Array initialization is implemented in Assembler
     """
 
+    language = 'C/Assembler'
     max_height = 34
 
     def build(self, height):
-        return c_pascal_triangle_initial(height, 0, self.return_list)
+        return c_pascal_triangle_partial_asm(height, 0, self.return_list)
--- ../implementations/cextension/cinitial.c
+++ ../implementations/cextension/cpartial_asm.c
@@ -2,10 +2,11 @@
 #include <stdio.h>
 
 // TODO LOW: Allocate memory on heap using malloc() instead of stack
+
 #define MAX_HEIGHT_UINT 34
 
 
-static PyObject* c_pascal_triangle_initial(PyObject* self, PyObject* args) {
+static PyObject* c_pascal_triangle_partial_asm(PyObject* self, PyObject* args) {
     int height, verbose=0, return_list=0;
     if (!PyArg_ParseTuple(args, "i|ii", &height, &verbose, &return_list)) return NULL;
 
@@ -20,8 +21,21 @@
         return NULL;
     }
 
+    asm (
+         // Fill in line with 1 and put 0 in the end
+         "movl %0, %%ecx\n\t"
+         "movq %1, %%rdi\n\t"
+         "incl %%ecx\n\t"
+         "movl $1, %%eax\n\t"
+         "cld\n\t"
+         "rep\n\t"
+         "stosl\n\t"
+         : /**/
+         : "g" (height), "g" (line)
+         : "eax", "ecx", "rdi"
+        );
+
     for (size = 0; size <= height; size++) {
-        line[start] = 1;
         for (index = start + 1; index < height; index++) {
             line[index] += line[index + 1];
         }
@@ -48,10 +62,10 @@
 
 
 static PyMethodDef cpascal_triangle_methods[] = {
-    {"c_pascal_triangle_initial", c_pascal_triangle_initial, METH_VARARGS},
+    {"c_pascal_triangle_partial_asm", c_pascal_triangle_partial_asm, METH_VARARGS},
     {NULL, NULL}
 };
 
-void initcinitial(void) {
-    (void) Py_InitModule("cinitial", cpascal_triangle_methods);
+void initcpartial_asm(void) {
+    (void) Py_InitModule("cpartial_asm", cpascal_triangle_methods);
 }

CPascalTriangleFullAsm

from .base import CPascalTriangleBase
from .cfull_asm import c_pascal_triangle_full_asm


class CPascalTriangleFullAsm(CPascalTriangleBase):
    """
    Based on :py:class::`CPascalTrianglePartialAsm`.

    Difference:
        - The algorithm is implemented in Assembler completely
    """

    language = 'C/Assembler'
    max_height = 34

    def build(self, height):
        return c_pascal_triangle_full_asm(height, 0, self.return_list)
#include <Python.h>
#include <stdio.h>

// TODO LOW: Allocate memory on heap using malloc() instead of stack

#define MAX_HEIGHT_UINT 34


static PyObject* c_pascal_triangle_full_asm(PyObject* self, PyObject* args) {
    int height, verbose=0, return_list=0;
    if (!PyArg_ParseTuple(args, "i|ii", &height, &verbose, &return_list)) return NULL;

    // TODO MEDIUM: Make a better fix to support zero-based Pascal's Triangle
    height++;
    unsigned int line[height + 1];
    int index;

    if(height > MAX_HEIGHT_UINT + 1) {
        PyErr_Format(PyExc_ValueError, "Unable to build Pascal's Triangle higher than %d", MAX_HEIGHT_UINT);
        return NULL;
    }

    asm (
         // Fill in line with 1 and put 0 in the end
         "movl %0, %%ecx\n\t"
         "movq %1, %%rdi\n\t"
         "movl $1, %%eax\n\t"
         "cld\n\t"
         "rep\n\t"
         "stosl\n\t"
         // End of filling
         "movq $0, (%%rdi)\n\t"  // TODO MEDIUM: There is no need for zero-padding
         "movq $1, %%rcx\n\t"  // line number being processed
         "xorq %%rdx, %%rdx\n\t"
         "movl %0, %%edx\n\t"  // maximum number of lines
         "movq %1, %%rdi\n\t"  // start of "line" array
         "outer:\n\t"
             "movq %%rdx, %%rbx\n\t"  // offset of current element
             "subq %%rcx, %%rbx\n\t"
             "xorq %%r8, %%r8\n\t"
             // print %%rcx dwords starting from (%%rdi,%%rbx,4)
             "inner:\n\t"
                 "movl 4(%%rdi,%%rbx,4), %%eax\n\t"  // read line[index + 1]
                 "addl %%eax, (%%rdi,%%rbx,4)\n\t"  // line[index] += line[index + 1]
                 "addq $1, %%rbx\n\t"
                 "incq %%r8\n\t"
                 "cmpq %%r8, %%rcx\n\t"
                 "jne inner\n\t"
             "incq %%rcx\n\t"
             "cmpq %%rdx, %%rcx\n\t"
             "jl outer\n\t"
             // print %%rdx dwords starting from (%%rdi)
         : /**/
         : "g" (height), "g" (line)
         : "eax", "rbx", "rcx", "rdx", "rdi", "r8"
        );

    // Some code that used results to ensure that optimization is not applied and assembly code is actually executing
    if(verbose) {
        for (index = height - 1; index >= 0; index--) {
            if(verbose) printf("%u ", line[index]);
        }
        if(verbose) printf("\n");
    }

    if(return_list) {
        PyObject* py_line = PyList_New(height);
        for (index = height - 1; index >= 0; index--) {
            PyList_SetItem(py_line, index, PyInt_FromLong((long) line[index]));
        }
        return py_line;
    } else {
        Py_INCREF(Py_None);
        return Py_None;
    }
}


static PyMethodDef cpascal_triangle_methods[] = {
    {"c_pascal_triangle_full_asm", c_pascal_triangle_full_asm, METH_VARARGS},
    {NULL, NULL}
};

void initcfull_asm(void) {
    (void) Py_InitModule("cfull_asm", cpascal_triangle_methods);
}
Diff
--- ../implementations/cextension/partial_asm.py
+++ ../implementations/cextension/full_asm.py
@@ -1,17 +1,17 @@
 from .base import CPascalTriangleBase
-from .cpartial_asm import c_pascal_triangle_partial_asm
+from .cfull_asm import c_pascal_triangle_full_asm
 
 
-class CPascalTrianglePartialAsm(CPascalTriangleBase):
+class CPascalTriangleFullAsm(CPascalTriangleBase):
     """
-    Based on :py:class::`CPascalTriangleInitial`.
+    Based on :py:class::`CPascalTrianglePartialAsm`.
 
     Difference:
-        - Array initialization is implemented in Assembler
+        - The algorithm is implemented in Assembler completely
     """
 
     language = 'C/Assembler'
     max_height = 34
 
     def build(self, height):
-        return c_pascal_triangle_partial_asm(height, 0, self.return_list)
+        return c_pascal_triangle_full_asm(height, 0, self.return_list)
--- ../implementations/cextension/cpartial_asm.c
+++ ../implementations/cextension/cfull_asm.c
@@ -6,17 +6,16 @@
 #define MAX_HEIGHT_UINT 34
 
 
-static PyObject* c_pascal_triangle_partial_asm(PyObject* self, PyObject* args) {
+static PyObject* c_pascal_triangle_full_asm(PyObject* self, PyObject* args) {
     int height, verbose=0, return_list=0;
     if (!PyArg_ParseTuple(args, "i|ii", &height, &verbose, &return_list)) return NULL;
 
+    // TODO MEDIUM: Make a better fix to support zero-based Pascal's Triangle
+    height++;
     unsigned int line[height + 1];
-
-    int start = height;
-    int size;
     int index;
 
-    if(height > MAX_HEIGHT_UINT) {
+    if(height > MAX_HEIGHT_UINT + 1) {
         PyErr_Format(PyExc_ValueError, "Unable to build Pascal's Triangle higher than %d", MAX_HEIGHT_UINT);
         return NULL;
     }
@@ -25,32 +24,48 @@
          // Fill in line with 1 and put 0 in the end
          "movl %0, %%ecx\n\t"
          "movq %1, %%rdi\n\t"
-         "incl %%ecx\n\t"
          "movl $1, %%eax\n\t"
          "cld\n\t"
          "rep\n\t"
          "stosl\n\t"
+         // End of filling
+         "movq $0, (%%rdi)\n\t"  // TODO MEDIUM: There is no need for zero-padding
+         "movq $1, %%rcx\n\t"  // line number being processed
+         "xorq %%rdx, %%rdx\n\t"
+         "movl %0, %%edx\n\t"  // maximum number of lines
+         "movq %1, %%rdi\n\t"  // start of "line" array
+         "outer:\n\t"
+             "movq %%rdx, %%rbx\n\t"  // offset of current element
+             "subq %%rcx, %%rbx\n\t"
+             "xorq %%r8, %%r8\n\t"
+             // print %%rcx dwords starting from (%%rdi,%%rbx,4)
+             "inner:\n\t"
+                 "movl 4(%%rdi,%%rbx,4), %%eax\n\t"  // read line[index + 1]
+                 "addl %%eax, (%%rdi,%%rbx,4)\n\t"  // line[index] += line[index + 1]
+                 "addq $1, %%rbx\n\t"
+                 "incq %%r8\n\t"
+                 "cmpq %%r8, %%rcx\n\t"
+                 "jne inner\n\t"
+             "incq %%rcx\n\t"
+             "cmpq %%rdx, %%rcx\n\t"
+             "jl outer\n\t"
+             // print %%rdx dwords starting from (%%rdi)
          : /**/
          : "g" (height), "g" (line)
-         : "eax", "ecx", "rdi"
+         : "eax", "rbx", "rcx", "rdx", "rdi", "r8"
         );
 
-    for (size = 0; size <= height; size++) {
-        for (index = start + 1; index < height; index++) {
-            line[index] += line[index + 1];
-        }
-
-        for (index = start + size; index >= start; index--) {
+    // Some code that used results to ensure that optimization is not applied and assembly code is actually executing
+    if(verbose) {
+        for (index = height - 1; index >= 0; index--) {
             if(verbose) printf("%u ", line[index]);
         }
         if(verbose) printf("\n");
-
-        start--;
     }
 
     if(return_list) {
-        PyObject* py_line = PyList_New(height + 1);
-        for (index = height; index >= 0; index--) {
+        PyObject* py_line = PyList_New(height);
+        for (index = height - 1; index >= 0; index--) {
             PyList_SetItem(py_line, index, PyInt_FromLong((long) line[index]));
         }
         return py_line;
@@ -62,10 +77,10 @@
 
 
 static PyMethodDef cpascal_triangle_methods[] = {
-    {"c_pascal_triangle_partial_asm", c_pascal_triangle_partial_asm, METH_VARARGS},
+    {"c_pascal_triangle_full_asm", c_pascal_triangle_full_asm, METH_VARARGS},
     {NULL, NULL}
 };
 
-void initcpartial_asm(void) {
-    (void) Py_InitModule("cpartial_asm", cpascal_triangle_methods);
+void initcfull_asm(void) {
+    (void) Py_InitModule("cfull_asm", cpascal_triangle_methods);
 }

TODO

  • Provide Assembler x86 implementation for 64-bit integer values
  • Support Python 3.x (port Python, Cython and C extensions code)