UnfairLinuxThreadsAndPythonOhMy

From a post to comp.lang.python, 20 April 1999, by Neil Schemenauer:

I think I might have found part of the problem. My Debian Linux system has glibc-2.1.3, which includes LinuxThreads 0.7. From the LinuxThreads FAQ:

-- D.6: Scheduling seems to be very unfair when there is strong contention on a mutex: instead of giving the mutex to each thread in turn, it seems that it's almost always the same thread that gets the mutex. Isn't this completely broken behavior?

-- What happens is the following: when a thread unlocks a mutex, all other threads that were waiting on the mutex are sent a signal which makes them runnable. However, the kernel scheduler may or may not restart them immediately. If the thread that unlocked the mutex tries to lock it again immediately afterwards, it is likely that it will succeed, because the threads haven't yet restarted. This results in an apparently very unfair behavior, when the same thread repeatedly locks and unlocks the mutex, while other threads can't lock the mutex.

-- This is perfectly acceptable behavior with respect to the POSIX standard: for the default scheduling policy, POSIX makes no guarantees of fairness, such as "the thread waiting for the mutex for the longest time always acquires it first". This allows implementations of mutexes to remain simple and efficient. Properly written multithreaded code avoids that kind of heavy contention on mutexes, and does not run into fairness problems. If you need scheduling guarantees, you should consider using the real-time scheduling policies SCHED_RR and SCHED_FIFO, which have precisely defined scheduling behaviors.
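The scheduling policies named in the FAQ answer can be inspected from Python. The following is a minimal sketch, not part of the original post: it assumes Python 3.3 or later on a platform such as Linux where the os module exposes the sched_* calls, and it looks at the policy of the whole process rather than of an individual thread. The helper name describe_scheduling is made up for illustration.

```python
import os

# Policy constants only exist on platforms that support them (e.g. Linux).
POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR")
    if hasattr(os, name)
}


def describe_scheduling(pid=0):
    """Return the scheduling policy of `pid` (0 means this process) and the
    priority range of each known policy, or None if the platform lacks
    the sched_* calls."""
    if not hasattr(os, "sched_getscheduler"):
        return None
    policy = os.sched_getscheduler(pid)
    ranges = {
        name: (os.sched_get_priority_min(value),
               os.sched_get_priority_max(value))
        for value, name in POLICY_NAMES.items()
    }
    return {
        "policy": POLICY_NAMES.get(policy, str(policy)),
        "priority_ranges": ranges,
    }


if __name__ == "__main__":
    print(describe_scheduling())
```

Switching to a real-time policy would go through os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(priority)), which normally requires root privileges, so it is left out of the sketch.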

Threaded Python contends heavily for a few mutexes. Adding sched_yield() to a few strategic places seems to improve things a lot, but I don't know if it is the proper solution. Does anyone else know better? LinuxThreads 0.8 is supposed to be more fair.
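The effect of yielding after releasing a contended lock can be measured with a small experiment. This is a rough sketch under stated assumptions, not the patch described in the post: it uses Python 3's threading module, the function name contend and its parameters are invented for illustration, and in CPython the results are also shaped by the interpreter's own global lock, so the numbers say nothing definite about LinuxThreads itself.

```python
import os
import threading
import time

# Use the real sched_yield() where available; otherwise fall back to a
# zero-length sleep, which also lets other threads run.
yield_cpu = getattr(os, "sched_yield", lambda: time.sleep(0))


def contend(n_threads=4, duration=0.3, yield_after_release=False):
    """Let n_threads fight over one lock for `duration` seconds and return
    how many times each thread managed to acquire it."""
    lock = threading.Lock()
    counts = [0] * n_threads
    deadline = time.monotonic() + duration

    def worker(index):
        while time.monotonic() < deadline:
            with lock:
                counts[index] += 1
            if yield_after_release:
                yield_cpu()  # give waiting threads a chance to take the lock

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    print("no yield:  ", contend())
    print("with yield:", contend(yield_after_release=True))
```

Comparing the spread of the per-thread counts with and without the yield gives a crude picture of how evenly the lock is shared on a given machine.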

I think the attached code shows the problem (or maybe I just don't understand threads at all :). On my uniprocessor machine I get about four stars before the new thread seems to stop running:

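  ======================================================================
  # Python 2 code: the low-level `thread` module is named `_thread` in Python 3.
  import thread
  import os
  import sys

  def run():
      # Fork forever: each child prints one star and leaves the loop;
      # the parent thread reaps the child, then forks again.
      while 1:
          if os.fork() == 0:
              sys.stderr.write('*')
              break
          os.wait()

  # Start the forking loop in a second thread ...
  thread.start_new_thread(run, ())
  # ... while the main thread spins, competing for the interpreter lock.
  while 1:
      pass

  ======================================================================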

Additional comments by Tony Rossignol, 2000-04-25

Background:

We are running three Red Hat Linux servers as our Zope server farm. Two of these servers have kernel 2.2.12 w/ glibc 2.1.2 and the third has kernel 2.2.5 w/ glibc 2.0.7. All three are dual Pentium III servers, varying in speed from 400 to 500 MHz. The machine with the older kernel is the slowest box.

Both servers with the newer kernel/glibc experience unexplained restarts; server 3, the slower/older system, does not. Server 3 frequently remains up for the full 24 hours between our nightly restarts (which happen when a clone ZODB is copied over).

Results:

Running Neil's script (from above) on the various servers gave the following results: on servers 1 and 2, between 2 and 20 stars were printed before the new thread seemed to stop running, after which the python process consumed all available CPU; server 3 kept printing stars until the process was killed.
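For comparing machines it can help to have a variant of the script that stops by itself and reports a number. The following is a hedged sketch of such a variant, not the script that was actually run on these servers: it is a Python 3 port using the threading module, and the time limit, the function name count_stars and the use of os._exit in the child are all additions made for illustration.

```python
import os
import threading
import time


def count_stars(duration=0.5):
    """Fork children from a background thread while the main thread spins.
    Return how many children were forked and reaped within `duration`
    seconds."""
    deadline = time.monotonic() + duration
    stars = 0

    def run():
        nonlocal stars
        while time.monotonic() < deadline:
            pid = os.fork()
            if pid == 0:
                os.write(2, b"*")  # child: print one star to stderr ...
                os._exit(0)        # ... and exit without running parent code
            os.waitpid(pid, 0)     # parent thread: reap the child
            stars += 1

    # Daemon thread plus a join timeout, so a stalled worker cannot hang us.
    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    while time.monotonic() < deadline:
        pass                       # main thread busy-loops, as in the original
    worker.join(timeout=5)
    return stars


if __name__ == "__main__":
    print("\nstars:", count_stars())
```

A count that stays very low while the CPU is fully busy would match the behaviour seen on servers 1 and 2; a steadily growing count would match server 3.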

Meaning:

I don't know. But this is the first solid example I've seen that illustrates the observed differences between our servers, and offers some indication as to cause.