Apr 22, 2017

Refactor me

This repository contains a step-by-step refactoring of dirty code that was given to me as a test task to assess my coding skills. The only remark about the code was: "refactor_me.py is expected to contain Python 3.5.x code" (the file name itself was not actually provided in the task).

I did it so that every commit contains one particular change, described in the commit message. The original dirty code can be found in this commit: 1036c091cb70ef110b4e56702bdc012c8a110336

Remarks on the final result:

  • A 100-character line length limit is used on purpose


Please do not hesitate to submit pull requests with improvements if you feel that I missed something.

Apr 16, 2017

My repo stats

My prospective clients and employers often ask me to show some code before agreeing to work with me. Unfortunately (or fortunately), most of the code I have written was written professionally, so it is closed source and covered by NDAs. I cannot disclose the code, but it is OK to publish some stats to shed light on what I have done in the past as a software developer.

Lamoda / Senior Developer

Repository | Frameworks | Period | Total lines | My lines at the latest commit | My contribution | My lines + / - | Python lines
apilib | SQLAlchemy | Apr, 2013 - May, 2015 | 40,695 | 18,738 | 46% | +32,063 / -12,891 | 30,002 (74%)
apigateway | Spyne | Apr, 2013 - May, 2015 | 5,768 | 2,725 | 47% | +6,464 / -4,830 | 4,631 (80%)

Saprun / Team Leader / Architect

Repository | Frameworks | Period | Total lines | My lines at the latest commit | My contribution | My lines + / - | Python lines
NDA_core | Django, Celery | Jun, 2015 - Aug, 2016 | 43,972 | 7,223 | 16% | +26,497 / -31,412 | 36,819 (84%)
NDA_communication | Autobahn / Crossbar | Aug, 2015 - Jul, 2016 | 1,657 | 589 | 36% | - | 779 (47%)

Diamondmine / Senior Developer

Repository | Frameworks | Period | Total lines | My lines at the latest commit | My contribution | My lines + / - | Python lines
diamondmine-server | Django | Jul, 2016 - Oct, 2016 | 5,393 | 5,204 | 96% | +6,284 / -775 | 1,801 (33%)
diamondmine-processor | Celery | Jul, 2016 - Sep, 2016 | 1,689 | 1,689 | 100% | +2,501 / -812 | 1,367 (81%)

Semilimes / Senior Developer

Repository | Frameworks | Period | Total lines | My lines at the latest commit | My contribution | My lines + / - | Python lines
NDA | Flask | Nov, 2016 - Dec, 2016 | 21,466 | 2,595 | 12% | +4,589 / -4,029 | 11,506 (54%)

Trounceflow / Team Leader

Repository | Frameworks | Period | Total lines | My lines at the latest commit | My contribution | My lines + / - | Python lines
website | Django | Dec, 2016 - Mar, 2017 | 22,212 | 2,797 | 13% | +13,195 / -11,123 | 19,035 (86%)

Acura Capital / Senior Developer

Repository | Frameworks | Period | Total lines | My lines at the latest commit | My contribution | My lines + / - | Python lines
NDA-poller | gevent | Jan, 2017 - Apr, 2017 | 6,012 | 6,012 | 100% | +18,441 / -12,429 | 5,381 (90%)
NDA-bidder | gevent | Mar, 2017 - Apr, 2017 | 1,730 | 1,730 | 100% | +2,366 / -636 | 1,272 (74%)
NDA-common | - | Mar, 2017 - Apr, 2017 | 1,991 | 1,991 | 100% | +2,280 / -289 | 1,737 (87%)

Open source projects

Repository | Frameworks | Period | Total lines | My lines at the latest commit | My contribution | My lines + / - | Python lines
pascal_triangle | - | Apr, 2015 - Apr, 2017 | 2,840 | 2,840 | 100% | +10,546 / -7,706 | 1,239 (44%)
dmu-utils | - | Mar, 2017 - Apr, 2017 | 372 | 372 | 100% | +379 / -7 | 261 (70%)

Mar 25, 2017

HackerRank stats

My current HackerRank stats:
Metric | Contest (World) | Contest (Russia) | Practice (World) | Practice (Russia)
Top | 10% | 30% | 10% | -
Percentile | 91.47 | 71.08 | 92 | -
Rank | 10 247 (out of 120 108) | 504 (out of 1 743) | 3 049 (out of 757 851) | 82 (out of 7 748)

Jan 7, 2017

Definition of Done

This is a sample definition of done that I use in mature development process environments.

A task is considered done if all of the conditions below are met:

  1. Source code that corresponds to the task description has been developed
  2. Unit tests that test the source code for the task have been developed
  3. All unit tests (including those developed for the task) pass successfully
  4. New features are covered by an integration test
  5. The integration test passes successfully
  6. Changes that describe the installation and migration process related to the task are provided in the corresponding documentation or deployment scripts
  7. The change log is updated
  8. The pull request has passed code review (all comments are either covered by fixes or otherwise resolved with the responsible reviewer) and has been merged into the upstream repository
  9. Notes for QA people are added to the task
  10. The task has been deployed to the test environment
  11. The task has passed manual testing successfully: discovered bugs are either fixed or moved to separate issues for later fixing
  12. The task has been deployed to live, where it does not expose bugs and shows correct behavior


Dec 15, 2016

Strict dependencies

UPDATE 2019-02-21: It seems that pipenv solves the problem completely.

Use strict version dependencies to prevent unexpected upgrades. Apply the same rule to the dependencies of your dependencies recursively. Example: dependency-package-name==x.y.z

What happens when you do not follow the above rule? Let us look at a development cycle on a timeline.

Bob is a developer who does not follow the rule above. When he needs to add a new dependency, he just adds it to setup.py or requirements.txt without specifying a version. Then he runs pip install -e . or pip install -r requirements.txt to install the dependency. Since an exact version is not specified, pip installs the latest version available at that moment. Everything works fine, and Bob happily continues development and uses the dependency in his code. The dependency will not be upgraded when a new version is released, because pip does not upgrade by default: if no version is specified, it checks only for the presence of the dependency, not its version.

Time passes, say a month or two or a year, and at some point Alice joins the team to help Bob with the development. Alice sets up her development environment and runs pip install -e . or pip install -r requirements.txt to install all dependencies. Since exact versions of the dependencies are not specified, pip installs the latest versions available at the moment of installation. Modern development cycles are short and releases are frequent, so it is very likely that Alice gets newer versions than Bob has.

The first consequence is that Bob and Alice start developing in different environments, which is bad because different versions of the same dependencies may behave differently. Something that works for Alice may not work for Bob and vice versa. This leads to time wasted investigating the "magical" behavior.

Another consequence is that a newer version of a dependency may become backward incompatible, whether intentionally, by mistake or because the dependency was used improperly. I have had several such cases in my experience. In this case the program will fail with an exception or, which is worse, run with a logical error. Alice will still need to contact Bob and ask him to run pip freeze to learn which version of the dependency works, so that she can install exactly that version.

The worst case is when a working combination of dependencies is lost. For example, Bob may have left the company before Alice joined. In this case Alice will need to downgrade a dependency or a combination of dependencies version by version to find a working one.

The same happens when a production environment has to be deployed. Which versions of the dependencies should be installed if the developers set up their environments half a year ago? Which developer's environment represents the master set of dependency versions?

One more thing happens when dependencies are not strict: managed dependency upgrades become impossible. Upgrades happen randomly as new environments are set up, which may lead to unplanned work because of backward incompatibilities or the need to upgrade your own code along with the dependencies.
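As a rough illustration of enforcing the rule, the following sketch scans the text of a requirements file and reports every entry that is not pinned with an exact `==` version. The names `find_unpinned`, `PINNED_REQUIREMENT` and `SAMPLE` are hypothetical, and the pattern is a simplification: it does not handle environment markers, URLs or hashes.

```python
import re

# Simplified pattern for "name==x.y.z", optionally with extras such as "name[extra]==x.y.z".
PINNED_REQUIREMENT = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def find_unpinned(requirements_text):
    """Return the requirement lines that do not use an exact '==' pin."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options (-r, -e, ...)
            continue
        if not PINNED_REQUIREMENT.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical requirements content used only for the demonstration.
SAMPLE = """
# strictly pinned
requests==2.18.4
# not pinned, or pinned loosely
flask
celery>=4.0
"""

print(find_unpinned(SAMPLE))  # ['flask', 'celery>=4.0']
```

A check like this could run in continuous integration so that an unpinned dependency is noticed when it is added, not a year later when a new environment is set up.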

My other Python development practices

Dec 7, 2016

My Python software development practices

UPDATE ON 2022-11-12

This post is a continuously updated list of the software development practices that I use every day when developing software in Python. They include coding conventions, software design principles and software project management principles.

Well-known conventions and practices

Extensions, exceptions and customization to well-known conventions and practices from the previous section

Python-specific

  • Python code maximum line length is 100 or 120 characters
  • Preferred order of attributes within a class:
    • NAMED_CONSTANTS
    • class data attributes
    • __init__() if present
    • __magic_methods__()
    • @properties
    • @staticmethods
    • @classmethods
    • _private methods
    • public methods (related private and public methods should be placed together for better readability)
  • When overriding a method, always call the overridden parent class method (with super) and return its result (if you are not altering it), even if the overridden method currently returns nothing but None (it is a forward compatibility measure)
  • Usage of introspection (e.g. getattr(), hasattr()) and __magic_methods__ override should be well-considered. If a feature can be implemented without introspection or explicit __magic_methods__, it should be implemented without them (with very rare exceptions). Usage of introspection or explicit __magic_methods__ is usually a sign of bad code design or "reinvention of the wheel"
  • Use strict version dependencies to prevent unexpected upgrades. Apply the same rule to the dependencies of your dependencies recursively. Example: dependency-package-name==x.y.z (explanation) → Use poetry
  • Use Python packaging instead of requirements.txt and git-based deploys → Use poetry
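
The rule about overridden methods from the list above can be sketched as follows. The classes `BaseHandler` and `LoggingHandler` are hypothetical and exist only for this illustration.

```python
class BaseHandler:
    def handle(self, message):
        # At the moment the parent method returns nothing (implicitly None).
        print("base handling:", message)


class LoggingHandler(BaseHandler):
    def handle(self, message):
        print("about to handle:", message)
        # Call the parent implementation and pass its result through, so callers
        # keep working if BaseHandler.handle() starts returning a value later.
        return super().handle(message)


result = LoggingHandler().handle("ping")
print(result)  # None today, because BaseHandler.handle() returns nothing
```

If `BaseHandler.handle()` is later changed to return, say, a status flag, `LoggingHandler` propagates it without any modification.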

Development Process

  • Collective code ownership
  • Code review is a good investment of your time
  • Favor code development performance (productivity) over code run-time performance (with reasonable trade-offs)
  • Favor good enough code and code development performance over perfect code when you have tight deadlines. You can always improve later
  • It is fine to write optimal code in the first place (in terms of size, performance or resource usage) if it comes at zero or near zero cost (we should not deoptimize code on purpose to avoid being accused of premature optimization)
  • Dedicate time to code run-time performance optimization only if it is really needed. Premature optimization may result in a waste of time
  • Every time you put a dirty hack, or just something that can be done better, into your code, put a TODO near it with an explanation and/or a description of actions for improvement. It will help to track technical debt and help other developers, during refactoring or regular development, to understand whether they are right to judge this code as strange and in need of improvement
  • If it is hard to choose between alternative technical solutions, then choose any of them instead of wasting time trying to make the right choice. Later when you have more information or circumstances change you can refactor if the original choice was wrong
  • Using a language feature just to show others that you know the feature is unprofessional
  • Try to see more lines on the screen at the same time for a lower defect count (a vertical display orientation and elimination of meaningless lines may help)
  • Keep the development environment as close to production as possible
  • Keep the testing environment identical to the production environment (including the deployment procedure)
  • Never submit changes that are known to break something that already works
  • Never remove something that you do not understand or because you do not understand it

Code Design

  • Write code to be read by humans
  • Favor code readability over code run-time performance (with reasonable trade-offs)
  • Reuse your own code and code from publicly available libraries. Reinventing the wheel is a waste of time
  • Avoid code copy & paste unless you have very strong reasons for it
  • Know the difference between agile and universal: instead of writing code for all imaginary future use cases, write code that can be easily adapted to many of the future use cases (also known as maintainable code)
  • Maintain the least possible cyclomatic complexity of the code for better readability and a lower defect count
  • Favor readable code over well-commented code. A necessity for a comment is a good sign of poor readability of the code
  • Consider writing a logging message instead of a comment. It will serve two purposes
  • Code run-time performance optimization should start from identifying bottlenecks and their elimination instead of something that is easy to optimize
  • If a literal is used twice in the code, prefer putting it into a named constant. Even literals that occur only once deserve to be put into a named constant, because the named constant will describe the nature of the literal
  • Do not put .gitignore into git repository. Every developer should be free to ignore whatever extra file he/she has locally (.gitignore should not aggregate every developer's local "mess") → Use .git/info/exclude for developer specific ignores
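
Two of the points above, named constants and log messages in place of comments, can be combined in one small sketch. The names `MAX_ATTEMPT_COUNT`, `fetch_with_retries` and `flaky` are hypothetical, and the retry limit of 3 is an arbitrary value chosen for the example.

```python
import logging

logger = logging.getLogger(__name__)

# A named constant documents what the literal 3 means (hypothetical value).
MAX_ATTEMPT_COUNT = 3


def fetch_with_retries(fetch):
    """Call fetch() until it succeeds or MAX_ATTEMPT_COUNT attempts have been made."""
    for attempt in range(1, MAX_ATTEMPT_COUNT + 1):
        try:
            return fetch()
        except IOError as error:
            # The log message explains the code to the reader and also shows up at run time.
            logger.warning("Attempt %d of %d failed: %r", attempt, MAX_ATTEMPT_COUNT, error)
    raise RuntimeError("All %d attempts failed" % MAX_ATTEMPT_COUNT)


calls = []


def flaky():
    """Fail on the first call and succeed afterwards."""
    calls.append(1)
    if len(calls) < 2:
        raise IOError("temporary failure")
    return "ok"


print(fetch_with_retries(flaky))  # ok
```

The warning serves the two purposes mentioned in the list: it tells the reader of the code what is going on, and it leaves a trace in the logs when a failed attempt actually happens.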

Code style

  • Keep the code short in number of lines and characters for better readability and a lower defect count, unless it impacts performance or readability
  • TODO format: TODO(author) <LOW | MEDIUM | HIGH | CRITICAL>: textual description of todo
  • A longer variable, function, class or method name is better than an unclear or ambiguous name
  • Code style should be consistent across the entire code base
  • Code base should not contain commented out source code unless it is a part of a comment or TODO
  • Non-ASCII characters should not be a part of the source code. Presence of such characters is a sign of poor localization, internationalization or parameterization of the code
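
The TODO format from the list above lends itself to automated checks. Below is a minimal sketch of a parser for it, assuming that the angle brackets in the format denote a choice of one priority keyword, as in `TODO(jdoe) HIGH: ...`. The names `TODO_PATTERN` and `parse_todo`, as well as the author `jdoe`, are hypothetical.

```python
import re

# Assumed concrete form: "TODO(author) PRIORITY: description".
TODO_PATTERN = re.compile(
    r"TODO\((?P<author>[^)]+)\) (?P<priority>LOW|MEDIUM|HIGH|CRITICAL): (?P<description>.+)"
)


def parse_todo(comment):
    """Return (author, priority, description), or None if the comment does not match."""
    match = TODO_PATTERN.search(comment)
    if match is None:
        return None
    return match.group("author"), match.group("priority"), match.group("description")


print(parse_todo("# TODO(jdoe) HIGH: replace the polling loop with a callback"))
print(parse_todo("# TODO: fix later"))  # None, because author and priority are missing
```

A parser like this makes it possible to list technical debt by priority or by author straight from the source tree.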

Organizational (may be boring for some developers)

  • Every decision should be reasonable (based on reasons)
  • Actual responsibility for a decision always lies with the person(s) who made the decision, regardless of how it was prepared or formalized
  • Every decision should be considered over every time horizon (short, medium and long term)
  • Every technical decision should be made considering its effect on business as the most important criterion
  • Technical solutions of the highest efficiency (= effect / expenses) should be chosen during the decision making process
  • Honor established processes, rules, methodologies and patterns, but do not hesitate to step out if it is beneficial in terms of profit and loss