Welcome to our round table! Each participant writes one blog post about his or her experiences with distributing scientific software. You are invited to post. More information here.

2011-09-22

Experiences with Python under git Control

I find Dag's idea very straightforward and brave.  I'm facing pretty much the same problem with Python only (i.e. no C libs, no Fortran libs, whatever).  I do use mpl (matplotlib), though.  Maybe you remember me doing some builds of numpy for OS X users.  At that time I solved the reproducibility and branchability problem using .. git.  Yes, the Python path under git control.  I put the whole Python installation directory, which is usually /Library/Frameworks/Python.framework/Versions/2.x/, under git control as a single repository (i.e. I made this directory a repo).  This works pretty flawlessly after some tinkering.  There are some pitfalls, which can be avoided, and some problems, which can be solved.  The pitfalls will not be covered here.  One of the "major" problems is the .pth file issue: when merging in branches of different software that is installed via .pth, git sometimes cannot merge the .pth file properly, and this needs to be fixed manually.
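
To illustrate what such a manual .pth fix can amount to, here is a minimal sketch.  It assumes that both sides of the conflict contain only plain path entries, one per line, and that the right resolution is simply the union of both lists.  Neither assumption comes from my actual setup; the function name merge_pth_lines is made up for this example, and .pth files that contain "import ..." lines are not handled.

```python
def merge_pth_lines(ours, theirs):
    """Return the union of two .pth file contents, keeping first-seen order.

    Assumption (hypothetical helper): every non-blank line is a plain
    path entry, and duplicates across the two sides can be dropped.
    """
    merged = []
    seen = set()
    for line in ours.splitlines() + theirs.splitlines():
        entry = line.strip()
        # Skip blank lines and entries that were already taken over.
        if not entry or entry in seen:
            continue
        seen.add(entry)
        merged.append(entry)
    if not merged:
        return ""
    return "\n".join(merged) + "\n"


if __name__ == "__main__":
    ours = "./foo-1.0.egg\n./bar-2.0.egg\n"
    theirs = "./foo-1.0.egg\n./baz-0.3.egg\n"
    print(merge_pth_lines(ours, theirs))
```

The result would be written over the conflicted file by hand before committing the merge.  Whether a plain union is correct depends on the packages involved, e.g. two versions of the same egg would both end up on the path.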

There is one more major problem where I need feedback or lack knowledge: the speed issue with .pyc files being older than the corresponding .py files.  The .pyc files in general are one of the pitfalls of the method.  They are hard to reproduce so that they look the same as when installed.  And you cannot git-ignore them, because .pyc's lingering around can, for instance, disturb nose.  For this problem, which can be ignored altogether if load speed does not matter, I haven't found a real solution so far; I simply ignored it and kept the .pyc's in the repo.
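
To at least see how many files are affected after a checkout, a rough sketch like the following could be used.  It assumes the Python 2 layout, where foo.pyc sits next to foo.py, and it only compares file modification times, which may differ from the check the interpreter itself performs.  The name find_stale_pyc is invented for this example.

```python
import os


def find_stale_pyc(root):
    """Yield .pyc paths that are older than the .py file beside them.

    Assumptions: Python 2 style layout (foo.pyc next to foo.py) and a
    plain file-mtime comparison as the staleness criterion.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            source = os.path.join(dirpath, name)
            compiled = source + "c"  # foo.py -> foo.pyc
            if not os.path.exists(compiled):
                continue
            if os.path.getmtime(compiled) < os.path.getmtime(source):
                yield compiled


if __name__ == "__main__":
    import sys

    # Scan the directory given on the command line, or the current one.
    for path in find_stale_pyc(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

This only reports the files; it does not solve the problem.  Regenerating them, for instance with compileall.compile_dir(root, force=True) from the standard library, might help, but it also changes the .pyc files in the repo.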

I'm going to use this approach for my OS X Lion Python installation with a shared system Python and a user-local Python (probably via virtualenv), because I want the system Python to be free of the user's software, whether that software is of interest to others or not.  I've worked out a strategy for handling mpkg (i.e. point-and-click) installers: fetching from the system Python and deleting the branch in the system Python afterwards.  Anyway, this is planned, not yet tested.  Only the strategy for handling and maintaining the system Python is tested, and that one I used on a regular basis.

Of course, it would be possible to wrap git in a Python-specific application, but that would hinder portability to the C, Cython, Fortran etc. folks amongst us.

I must add that I don't have experience with most of the packaging software that is around, except Python's distutils and Bento (where I clearly prefer the latter).

In my experience there will not be the one solution.  The world is not as simple as we scientists would like it to be.  It's like a zoo out there [from a well-known movie].  Nothing will ever change this.  Life would be boring if everything were standardised and there were only one standard.  In my opinion, discussion serves a purpose on its own, i.e. propelling the mind by developing new ideas.  Standards might be a side effect of a good democratic evolution.  So this round table might have perfect conditions if it brings together the different parties, which were separated in mind before.
