11

We are designing a system based on independent microservices (connected via a RabbitMQ bus). The code will (for the first components at least) be written in Python (both Python 2 and Python 3). We already have a monolith application implementing some of the business logic, which we want to refactor into microservices and extend. One question that worries me is:

What is the best way to share code between the different microservices? We have common helper functions (data processing, logging, configuration parsing, etc.) which must be used by several microservices.

The microservices themselves are going to be developed as separate projects (git repositories). The common libraries can be developed as a self-contained project too. How do I share these libraries between the microservices?

I see several approaches:

  • copy around the version of the library that is needed for each microservice, and update as needed
  • release the common libraries to an internal PyPI, and list those libraries as dependencies in each microservice's requirements (a sketch of this follows the list)
  • include the library repository as a git submodule
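
To make the second option concrete, this is roughly what I imagine the consuming side would look like. A minimal sketch with invented names (the package acme-common, the service name, and the index URL are all placeholders):

    # setup.py of one microservice -- a sketch; "acme-common", the service
    # name and the index URL are invented for illustration.
    from setuptools import setup, find_packages

    setup(
        name="billing-service",
        version="0.1.0",
        packages=find_packages(),
        install_requires=[
            "acme-common>=1.2,<2.0",  # shared helpers from the internal PyPI
            "pika",                   # RabbitMQ client
        ],
    )

    # Installed against the internal index with something like:
    #   pip install --extra-index-url https://pypi.internal.example.com/simple/ .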

I would like to read a bit more about suggested approaches, best practices, past experiences before deciding on how to proceed. Do you have any suggestion or link?

blueFast
  • 213

4 Answers

7

Your second option is definitely the way to go. Break out the common libraries and install them onto your local PyPI server.
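
The breaking-out step itself is mostly packaging: the shared code becomes an ordinary installable project. A minimal sketch, assuming an invented project name acme-common:

    # setup.py of the shared helper library -- a sketch; "acme-common" is a
    # placeholder name. Once this is uploaded to the internal PyPI, every
    # microservice can depend on it like on any third-party package.
    from setuptools import setup, find_packages

    setup(
        name="acme-common",
        version="1.2.0",
        description="Shared helpers: data processing, logging, config parsing",
        packages=find_packages(),
        classifiers=[
            # The question mentions both interpreters.
            "Programming Language :: Python :: 2.7",
            "Programming Language :: Python :: 3",
        ],
    )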

Option 1 is horrendous because improvements to the libraries will be difficult to propagate to the other services that use them.

Option 3 is similar to option 1.

The common pattern is to set up Jenkins so that a push to master of a library repo triggers a Python build and automatically uploads the package to the PyPI repo. Once you write this build script, you'll never have to worry about packaging libraries and uploading them manually. And with this option, every library update is immediately available to be picked up by the other microservices.
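
The upload step that job runs can be very small. A sketch of one way to do it, assuming setuptools, wheel and twine are available on the build agent (the index URL is a placeholder, and credentials would normally come from CI secrets via TWINE_USERNAME / TWINE_PASSWORD):

    # publish.py -- a sketch of what the Jenkins job could run after a push
    # to master: build an sdist and a wheel, then upload both to the
    # internal index.
    import glob
    import subprocess
    import sys

    INTERNAL_INDEX = "https://pypi.internal.example.com/"

    def run(cmd):
        print("+ " + " ".join(cmd))
        subprocess.check_call(cmd)

    if __name__ == "__main__":
        run([sys.executable, "setup.py", "sdist", "bdist_wheel"])
        run(["twine", "upload", "--repository-url", INTERNAL_INDEX] + glob.glob("dist/*"))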

Setting up your own PyPI server is very easy. I like this guide.

Tommy
  • 263
2

I'm not a Python guy, but the PyPI server seems the best option. A quick search suggests it's analogous to having a Nexus repository for a team's Java JARs.

Really, as long as the library is deployed to some sort of central repository (internal to the office/team) that your dependency-management tool of choice can work with (read from and deploy to), it is a good option.

Option 1 is really the worst; you should never have to deal with dependencies manually. In college, before I knew about Maven and when I thought Git was too complicated, we did everything by hand: merging everyone's code, setting up classpaths, grabbing dependencies. It was a pain, and I would seriously not want anyone to go through even a fraction of that trouble, especially in a work environment where efficiency is important.

Option 3 would probably work fine, but it doesn't have any real benefit over a local PyPI other than perhaps being easier to set up, and the benefits of a real dependency-management system far outweigh that.

1

First of all, splitting a monolith into microservices is always going to be hard. See Decentralized Data Management - encapsulating databases into microservices for an idea of why.

That said, there are several recipes for how to do it relatively sanely. One of them is http://12factor.net/. That one would say that you should maintain each library and application independently, then manage dependencies explicitly. If you go that route, then I'm going to STRONGLY recommend that you have a simple command that updates all dependencies to whatever is current, and that you run it regularly for each microservice. It is important to have a sane release process where you lock down the versions of libraries used in production; however, you really, really, really don't want to be in a position where dependencies go stale and you don't know what is out there.
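
That "update everything" command can be quite small. A sketch of one way to do it, assuming each service pins its dependencies in a requirements.txt generated from pip freeze (the in-house package names are placeholders):

    # refresh_deps.py -- a sketch of the "update everything" command suggested
    # above: upgrade the in-house libraries to their latest releases, then
    # re-pin the full dependency set so production installs stay reproducible.
    import subprocess
    import sys

    INTERNAL_PACKAGES = ["acme-common", "acme-messaging"]

    def pip(*args):
        subprocess.check_call([sys.executable, "-m", "pip"] + list(args))

    if __name__ == "__main__":
        pip("install", "--upgrade", *INTERNAL_PACKAGES)
        # Re-freeze so the pinned versions get reviewed and committed.
        frozen = subprocess.check_output([sys.executable, "-m", "pip", "freeze"])
        with open("requirements.txt", "wb") as f:
            f.write(frozen)
        print("requirements.txt updated; review and commit the diff")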

Also focus on making your backing libraries as tight and focused as possible. There will always be a natural pull to start adding stuff to core libraries for easy sharing. Do that, and you'll rapidly pull the whole ball of existing spaghetti into shared libraries and effectively get back to the mess that you have now. It is therefore better to over-correct the other way.

btilly
  • 18,340
0

You should be able to go serverless by pointing your Python package dependency file directly at the private GitHub repos containing the libraries. Pipenv and Poetry both support this, I believe.
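
Plain pip requirements files can reference a Git repository directly as well; Pipenv and Poetry have equivalent git-dependency syntax in their own files. A sketch with invented organisation, repository and tag names:

    # requirements.txt of a microservice -- a sketch; the organisation, repo
    # name and tag are placeholders. pip installs the library straight from
    # the private repository (over SSH here), so no index server is needed.
    git+ssh://git@github.com/acme/acme-common.git@v1.2.0#egg=acme-common
    pika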