Productivity Sync

February 7, 2010

repo hacking and python reverse engineering

Filed under: Uncategorized — markgross @ 7:52 pm

Adding a format-patch feature to the Android repo program

repo is the SCM tool for the Google Android project; it’s basically a git tree aggregator written in Python.  It’s OK, but when porting Android to a new platform you may want to generate patch sets of your changes against a well-defined baseline.  The Android project has a way of defining baselines: the manifest XML files contain a listing of all the projects, git tree paths, and optionally SHA-1 git object hashes for the defined version.  For instance, there is a file eclair-20091115.xml that defines the exact code base for the November 15, 2009 posting of the Eclair code base.
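To make the baseline idea concrete, here is a minimal sketch of pulling per-project revisions out of a manifest.  The XML fragment is made up for illustration (the project names and hash are not taken from eclair-20091115.xml), and the sketch assumes each project element carries name, path and revision attributes.  It is not how repo itself parses manifests, and it is written for a current Python 3.

```python
import xml.etree.ElementTree as ET

# Made-up manifest fragment; names and hashes are illustrative only.
SAMPLE_MANIFEST = """\
<manifest>
  <project name="platform/example/one" path="example/one"
           revision="1111111111111111111111111111111111111111" />
  <project name="platform/example/two" path="example/two" />
</manifest>
"""


def baseline_revisions(manifest_text):
    """Map each project name to (path, revision); revision is None if unpinned."""
    root = ET.fromstring(manifest_text)
    result = {}
    for node in root.findall("project"):
        name = node.get("name")
        # In this sketch a missing path falls back to the project name.
        path = node.get("path", name)
        result[name] = (path, node.get("revision"))
    return result


baseline = baseline_revisions(SAMPLE_MANIFEST)
for name, (path, rev) in sorted(baseline.items()):
    print(name, path, rev)
```

A project without a pinned revision shows up with None, which is the case a format-patch feature would have to decide how to handle.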

It is useful to be able to extract the patch sets from the port and distribute the enabling work as a small patch set.  Hence the need for a format-patch feature.

The rest of this post is a combination of reverse engineering tricks and documentation of how the repo program is cobbled together.  It’s mostly for me, so I can remember what I did but, more importantly, how I figured it out enough to make it mostly work (and what tripped me up).

  • Find __main__: grep -r __main__
  • see the _Repo class, and its _Run function.
  • look closely at _Repo.__init__().  What’s that all_commands all about?
  • oh, all_commands gets imported from subcmds!  Why, that’s a directory with an __init__.py file.
  • looking at subcmds/__init__.py we see an iteration over all the *.py files in that directory that fills a dictionary “all” with the classes defined there (with the proper naming convention WRT the *.py filename…)
  • These subcmd classes each need to be a subclass of Command and include an Execute function, to be called by _Repo._Run()
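The discovery pattern described in the list above can be sketched roughly like this.  It is not the code from subcmds/__init__.py: the Command stand-in, the load_commands helper and the "diff.py defines class Diff" naming rule are my assumptions for illustration, and the demo writes its own throwaway module into a temp directory so it can run anywhere.

```python
import importlib.util
import os
import tempfile


class Command(object):
    """Stand-in base class; repo's real Command has more to it."""

    def Execute(self, opt, args):
        raise NotImplementedError


def load_commands(directory):
    """Return {command name: instance} for every *.py module in directory."""
    found = {}
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith(".py") or filename == "__init__.py":
            continue
        name = filename[:-3]
        class_name = name.capitalize()  # assumed rule: diff.py -> class Diff
        spec = importlib.util.spec_from_file_location(
            name, os.path.join(directory, filename))
        module = importlib.util.module_from_spec(spec)
        # Inject the base class so the demo file needs no imports of its own;
        # a real package would import it normally.
        module.Command = Command
        spec.loader.exec_module(module)
        cls = getattr(module, class_name, None)
        if cls is None or not issubclass(cls, Command):
            raise ImportError("%s does not define a Command named %s"
                              % (filename, class_name))
        found[name] = cls()
    return found


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "diff.py"), "w") as f:
        f.write(
            "class Diff(Command):\n"
            "    def Execute(self, opt, args):\n"
            "        return 'diff ran with %r' % (args,)\n"
        )
    all_commands = load_commands(tmp)
    diff_result = all_commands["diff"].Execute(None, ["some/project"])
    print(diff_result)
```

The point is only that dropping a correctly named file into the directory is enough to register a new subcommand, which is why adding a feature starts with adding a file.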

Now, to find out how code was getting called I stooped to sprinkling print statements, using pdb, and loading selected parts of the program into ipython.  I’m sure there must be a better way of doing this sort of thing, but this is what I did.

  • first looked at how similar repo commands worked.  (like repo diff)
  • grep for diff, see project.py has hits.  /me takes a closer look.
  • Also, recall how subcmds work, take a look at subcmds/diff.py

At this point I should point out that ctags -R * works for Python programs.  You want the tags when browsing the code.

  • ooh, see PrintWorkTreeDiff() in project.py; it’s a function in the Project class.
  • At this point I want to know what the members of the Project class instances are.  How do I get that data?  (ipython is my friend…)
  • Two ways to go at this point: pdb, or printing out selected arguments that get passed to the constructors we care about.
  • stick import pdb and pdb.set_trace() in the Execute function in the subcmds/diff.py file
  • use bt, up, and p to see the arguments passed into the command being run.
  • also see that the _Repo() class instance is created by passing the path to the .repo directory.
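For the pdb step, something along these lines is what I mean.  The Diff class here is a stand-in, not the real one from subcmds/diff.py, and the REPO_DEBUG_DIFF environment variable is a name I made up so the breakpoint only fires when asked for.

```python
import os
import pdb


class Diff(object):
    """Hypothetical stand-in for the class in subcmds/diff.py."""

    def Execute(self, opt, args):
        # Drop into the debugger only when asked, so normal runs are unaffected.
        if os.environ.get("REPO_DEBUG_DIFF"):  # made-up variable name
            pdb.set_trace()
        return len(args)


count = Diff().Execute(None, ["a", "b"])
print(count)
```

With the variable set, you land at the (Pdb) prompt inside Execute, where bt prints the call stack, up moves to the caller’s frame, and p opt or p args prints whatever was passed in.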

Now let’s go to ipython and do some poking around.

  • cd to .repo/repo and start ipython.
  • import main
  • repo = main._Repo('blaba/.repo')
  • see that repo._Run(argv) sets up a cmd from the dictionary all_commands; let’s look at one of those guys
  • diff = repo.commands['diff']
  • let’s look at what’s in this guy…
  • see that GetProjects() basically returns self.manifest.projects as a list.  Looking around we see that the command’s manifest is set up in _Run()
  • let’s look at that: cmd.manifest = XmlManifest(cmd.repodir)
  • I now know I need one of these for the baseline.  Let’s subclass XmlManifest to create my baseline manifest so I can get access to all the goodies in each of the projects in that list.
  • but first let’s look at what’s in a project.
  • man = main.XmlManifest(repo.repodir)
  • print man.projects.  Hmm, it’s a dict, with keys from the XML file.  Let’s look at one of the project objects…
  • p = man.projects['GAID/platform/packages/apps/Sync']
  • type p. and hit tab, then look around at what we have.
  • let’s try p.PrintWorkTreeDiff()
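If you don’t have ipython’s tab completion handy, plain dir() and vars() get you most of the way.  A small sketch, using a stand-in Project with made-up attributes (repo’s real Project class has many more, and public_members is just a helper I wrote for this example):

```python
class Project(object):
    """Stand-in with made-up attributes; not repo's Project class."""

    def __init__(self, name, worktree):
        self.name = name
        self.worktree = worktree

    def PrintWorkTreeDiff(self):
        print("diff for", self.name)


def public_members(obj):
    """Split an object's public names into data attributes and methods."""
    names = [n for n in dir(obj) if not n.startswith("_")]
    methods = [n for n in names if callable(getattr(obj, n))]
    data = [n for n in names if n not in methods]
    return data, methods


p = Project("example/project", "/tmp/example")
data, methods = public_members(p)
print("data:", data)
print("methods:", methods)
print(vars(p))  # instance attributes and their current values
```

vars(p) is the quick answer to “what are the members of this instance”, while dir(p) also shows what the class contributes.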

Ok, at this point I have a good bead on how this guy works.

To do my feature I need to add a function to Project to do git format-patch, add a subclass of XmlManifest for the baseline manifest files, and add a subcmds/format-patch.py with a FormatPatch subclass of Command.
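Roughly, the shape of that plan looks like the sketch below.  Everything in it is a stand-in: the Command and Project classes are cut down to what the example needs, FormatPatchArgv and the run parameter are names I invented, and the baseline manifest subclass is left out.  The run hook lets the demo record the command lines instead of actually invoking git.

```python
import os
import subprocess


class Command(object):
    """Stand-in for repo's Command base class."""

    def Execute(self, opt, args):
        raise NotImplementedError


class Project(object):
    """Stand-in; only the fields this sketch needs."""

    def __init__(self, name, worktree, baseline_rev):
        self.name = name
        self.worktree = worktree
        self.baseline_rev = baseline_rev

    def FormatPatchArgv(self, output_dir):
        """Build the git command line, one list element per argument."""
        return ["git", "format-patch",
                "--output-directory", output_dir, self.baseline_rev]


class FormatPatch(Command):
    def __init__(self, projects, run=subprocess.call):
        self.projects = projects
        self.run = run  # swappable so the sketch can run without git

    def Execute(self, opt, args):
        output_root = args[0]
        for project in self.projects:
            out = os.path.join(output_root, project.name)
            self.run(project.FormatPatchArgv(out), cwd=project.worktree)


calls = []


def record(argv, cwd=None):
    """Record the call instead of running it."""
    calls.append((cwd, argv))
    return 0


cmd = FormatPatch([Project("example", "/src/example", "abc123")], run=record)
cmd.Execute(None, ["/tmp/patches"])
print(calls)
```

Each project gets its own output directory under the root, so patches from different git trees don’t collide on file names.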

So working with Python comes down to a combination of reading code, using pdb and ipython, and perhaps putting some prints in the code.

Nothing too magical, but these are what I needed to do.

Gotcha: when you hand subprocess a list, every element is passed to the program as exactly one argument, so no element should contain a space that was meant to separate two arguments!  I got wrapped up with subprocess.call(['git','format-patch','--output-directory /home', rev]) only to finally figure out that the 3rd element should really be two, the 3rd and 4th.  One trick is to use strace to see what the list should be:

strace -e trace=execve git format-patch --output-directory /home/mgross to see what should be passed in.
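Here is the same gotcha spelled out as a small sketch.  The rev value is a placeholder, and nothing here actually runs git; it only compares the argument lists.

```python
import shlex

rev = "abc123"  # placeholder revision

# Wrong: the option and its value are glued into one list element, so git
# receives "--output-directory /home" as a single argument.
wrong = ["git", "format-patch", "--output-directory /home", rev]

# Right: one list element per argument.
right = ["git", "format-patch", "--output-directory", "/home", rev]

# shlex.split applies shell-style word splitting to a command line you
# already know works in a terminal.
from_shell = shlex.split("git format-patch --output-directory /home " + rev)

print(len(wrong), len(right), from_shell == right)
```

Running a known-good command line through shlex.split is a cheaper check than strace when all you want is the list to pass to subprocess.call.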

Handy pdb commands: help, n, up, down, s, l, p

It’s not too good at introspection.
