
At my company, we use a distributed pool of virtual machines to run our UI and API tests. These machines are all connected to an onsite server which the pool uses for publishing the test results and outputs.

The problem is that storage on this server is limited, and each day we produce 500 MB to 5 GB of reports (CSVs, screenshots, text logs, etc.). We would like to preserve these reports to help QA identify issues, but we end up routinely deleting large batches of them just to free up space.

Recently, we have moved our test scripts and inputs to a Git repo on VSTS. This not only frees up some space on our test server, but also allows for source control.

We want to do the same with the test outputs. The only issue is that the repo for this would be MASSIVE, far larger than the tiny local storage allotted to each test machine. And since everything I've found online suggests that each machine would need a full copy of the repo in order to push to it, this approach seems unworkable.

My question is, how can I go about making this work? Is there a way to push an individual file or a collection of files to a VSTS repo without cloning it locally first? I've looked at Git submodules, but I'm unsure how reliable or stable that would be, since we would need about 1,500 submodules to get this repo down to a reasonable size. Is there a better solution for storing large amounts of test output data?

1 Answer


I am going to rephrase your problem slightly:

We would like to capture the test results and store them for later review.

I believe that you should be trying to capture these test results as files with associated metadata, where the metadata allows you to search for and review collections of files based on a number of criteria.

Content Management Is Key

Right now, you are encoding your metadata in the file name, the directory hierarchy, the source server type, and the file system information. If you make your metadata explicit, using document content models, you could then store your test data in a content management system.

A content management system does not have to be a commercial product. For instance, you could use Apache Jackrabbit as your API for handling the metadata, along with a conventional file system to store the actual files. Open source products, such as Alfresco, use Jackrabbit under the hood to provide a useful web-based GUI for queries and reports.
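For illustration only, here is a minimal sketch of what registering one test report against the JCR API (which Jackrabbit implements) might look like. The node layout and the property names (testId, release, status, reportPath) are assumptions made up for this example, not a prescribed content model.

```java
import javax.jcr.Node;
import javax.jcr.Repository;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;

import org.apache.jackrabbit.commons.JcrUtils;
import org.apache.jackrabbit.core.TransientRepository;

public class RecordTestReport {
    public static void main(String[] args) throws Exception {
        // TransientRepository is Jackrabbit's embedded, self-configuring repository;
        // a production setup would point at a shared repository instead.
        Repository repository = new TransientRepository();
        Session session = repository.login(
                new SimpleCredentials("admin", "admin".toCharArray()));
        try {
            // Hypothetical layout: /test-runs/<run id>, one node per report,
            // with the metadata held as searchable properties.
            Node runs = JcrUtils.getOrAddNode(session.getRootNode(), "test-runs");
            Node report = runs.addNode("ABC-123-2017-06-01T0300", "nt:unstructured");
            report.setProperty("testId", "ABC-123");
            report.setProperty("release", "4.2.1");
            report.setProperty("status", "failed");
            // The bulky artifacts stay on the conventional file system;
            // the repository only records where they live.
            report.setProperty("reportPath",
                    "/mnt/test-output/ABC-123/2017-06-01/results.csv");
            session.save();
        } finally {
            session.logout();
        }
    }
}
```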

With a good metadata model, it would be possible to issue search queries such as "How often did test ABC-123 fail in the last six months?" or "Provide all the test files associated with release 4.2.1."
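As a concrete (again hypothetical) example, with properties like the ones sketched above, the first of those questions becomes a single JCR-SQL2 query instead of a crawl over directories of files:

```java
import javax.jcr.Node;
import javax.jcr.NodeIterator;
import javax.jcr.Session;
import javax.jcr.query.Query;
import javax.jcr.query.QueryManager;
import javax.jcr.query.QueryResult;

public class CountFailures {
    // Counts how many stored runs of a given test are marked as failed.
    // Assumes the hypothetical testId/status properties from the sketch above.
    static long countFailures(Session session, String testId) throws Exception {
        QueryManager qm = session.getWorkspace().getQueryManager();
        Query query = qm.createQuery(
                "SELECT * FROM [nt:unstructured] AS r "
                        + "WHERE r.[testId] = $testId AND r.[status] = 'failed'",
                Query.JCR_SQL2);
        query.bindValue("testId",
                session.getValueFactory().createValue(testId));
        QueryResult result = query.execute();
        long failures = 0;
        for (NodeIterator it = result.getNodes(); it.hasNext(); ) {
            Node report = it.nextNode();
            System.out.println("Failed run: " + report.getPath());
            failures++;
        }
        return failures;
    }
}
```

A date property on each report node would let you add the "last six months" constraint to the same query.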

Git is Not a Fit

A software configuration management repository such as Git is not a good fit. You will still need to encode your metadata in awkward, idiosyncratic ways, and you will still be left with the problem of aging out your data.

BobDalgleish