5

I realize this is a somewhat vague question, but I have a Python script that needs to run for two years on a Raspberry Pi and is failing after about 3 hours. Without getting into too much detail about what the script does (I'm not sure the script itself is at fault), what is interesting is that the process appears to stop dead in its tracks: no warnings, errors or failures are generated when it fails; the process just stops and breaks my terminal session, i.e., I can't enter any more commands when it happens. The process also disappears from the list of processes the Pi is running (as shown by top).

Anybody have any idea what might be going on? Is there any reason the script would just stop after some time? I'm more than happy to post extensive details about what the script does if need be; I just thought it might have more to do with how it's interacting with the OS.

This is how I am running the script:

python animation.py &

Running a Model B+ (512 MB), connected to the internet via Wi-Fi, powering the Pi via USB.

UPDATE:

I tried running the script from my Mac, and the same thing happened about 3 hours in. This time the process didn't disappear from the process list; it remained in a sleeping state and its CPU usage dropped to 0%, while the screen I was watching the stdout on appeared to be frozen. I am doing some serial communication in the script; is it possible it's getting hung up waiting on a response?
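If the reads have no timeout, a device that stops responding would block the script forever with exactly these symptoms (sleeping, 0% CPU). A guarded read looks roughly like this (assuming pyserial; the port and baud rate here are placeholders, not my actual settings):

    import serial  # pyserial

    # Hypothetical port and settings; adjust to the actual device
    ser = serial.Serial("/dev/ttyUSB0", 9600, timeout=5)

    line = ser.readline()  # returns b"" after 5 s instead of blocking forever
    if not line:
        print("serial device did not respond, retrying...")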

Collin Schupman

3 Answers

10

Not quite an answer but a guess, since this is a pretty vague question.

I'm presuming that something you start with the intent of having it run for years is also intended to outlast the login session which started it -- unless you start it via the init system, which you don't refer to in the question.

If/when you are starting it from a login (including ssh), simply backgrounding something is not sturdy enough. You also have to take care of a few things:

  • Making sure the process is properly re-parented by init.
  • Cutting off standard input and output streams, if you aren't otherwise redirecting those.

So,

setsid python animation.py < /dev/zero &> /dev/null &

See man setsid -- this ensures the forked process will be re-parented by init. The other stuff is input/output redirection (the output you probably actually want to send to a log instead of /dev/null).
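A partial belt-and-braces measure from inside the script itself is to ignore the hangup signal, which is what normally kills a backgrounded job when its terminal goes away. This is not a replacement for setsid (it doesn't detach from the controlling terminal), just an extra layer; a minimal sketch:

    import signal

    # Unix only: keep running even if the controlling terminal disappears
    signal.signal(signal.SIGHUP, signal.SIG_IGN)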

If this doesn't solve your problem, and/or you want a way to monitor the process over a long period of time, have a look at plog.

goldilocks
2

Just some knowledge I ran across a few weeks ago. I have seen people with similar problems when building a hydroponics system. It turned out that variables were being incremented past the maximum value their type could hold. When you run a program for an extended period of time, I have seen a few cases where this is the issue. I think (as a work-around) they used the "long long" type or an "unsigned int" so that the incremented values could store a larger number before it crashed.
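For context, Python's own integers are arbitrary precision and never overflow, so this particular failure mode usually comes from fixed-width types in C code, firmware, or libraries such as numpy. A rough sketch of the silent wrap-around a fixed-width counter exhibits:

    UINT32_MAX = 2**32 - 1

    def add_u32(a, b):
        # Emulate a 32-bit unsigned counter: overflow wraps with no error
        return (a + b) & UINT32_MAX

    print(add_u32(UINT32_MAX, 1))  # prints 0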

Justin C.
1

Run your code with a profiler and debugger. Something in the script, or in how it's set up, is causing it to fail.

My best guess is a memory leak or a variable overflowing.

Especially with C code involved, my mind jumps to memory leaks. It is easy to forget who is supposed to free memory that comes out of a function. And automatic garbage collectors are not perfect.
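If a leak is the suspect, the standard-library tracemalloc module can show where memory is growing over time without attaching a full profiler; a minimal sketch:

    import tracemalloc

    tracemalloc.start()

    # ... let the animation loop run for a while ...

    current, peak = tracemalloc.get_traced_memory()
    print(f"current={current} bytes, peak={peak} bytes")

    # Top allocation sites, grouped by source line
    for stat in tracemalloc.take_snapshot().statistics("lineno")[:5]:
        print(stat)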

Variables overflowing is a distinct possibility. Why are you increasing a variable every 30 seconds or so? Wouldn't it be easier just to calculate the value when you need it?
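For example, instead of incrementing a counter on every tick, the value can be derived from a monotonic clock whenever it is needed; roughly (the 30-second interval is just the guess above):

    import time

    start = time.monotonic()

    def ticks_elapsed(interval=30):
        # Number of 30-second intervals since start, computed on demand
        return int((time.monotonic() - start) // interval)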

Large amounts of code have been posted here before. Unless you post it, anything we say is just a guess.

NomadMaker