Recently I was debugging a strange error when using the eventlet support in the
new coverage 4.0 alpha. The issue manifested itself as a network connection
problem: "Name or service not known". This was confusing, since the host it was
trying to connect to was localhost. How can it fail to resolve localhost?! When
I switched off the eventlet tracing, the problem went away.
After banging my head against this for a few days, I finally remembered a tool
I rarely think to pull out: strace.
There's an excellent blog post by Chad Fowler showing the basics of strace,
The Magic of Strace.
After tracing my test process, I could easily search the output for my error
message:
11045 write(2, "2014-10-15 09:16:48,348 [ERROR] py2neo.packages.httpstream.http: !!! NetworkAddressError: Name or service not known", 127) = 127
and a few lines above lay the solution to my mystery:
11045 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = -1 EMFILE (Too many open files)
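Searching a large trace by eye gets tedious, so here is a minimal sketch of how the failed system calls could be pulled out of saved strace output. It assumes the trace was written to a file (for example with strace's -f and -o options, which is what produces the leading PID column seen above) and that each call sits on a single line. The helper name failed_syscalls and the regular expression are my own illustration, not part of strace or any library.

```python
import re

# Matches failed calls in strace output, e.g.:
#   11045 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = -1 EMFILE (Too many open files)
# The leading PID is optional, since strace only prints it in some modes.
FAILED_CALL = re.compile(
    r"^(?:(?P<pid>\d+)\s+)?"          # optional PID column
    r"(?P<syscall>\w+)\(.*\)"         # syscall name and its arguments
    r"\s+=\s+-1\s+"                   # a return value of -1 marks a failure
    r"(?P<errno>E[A-Z0-9]+)\s+"       # symbolic errno, e.g. EMFILE
    r"\((?P<message>.*)\)\s*$"        # human-readable description
)


def failed_syscalls(lines, errno_name=None):
    """Yield (pid, syscall, errno, message) for each failed call.

    If errno_name is given (e.g. "EMFILE"), only calls that failed with
    that errno are yielded. This is a hypothetical helper for illustration.
    """
    for line in lines:
        match = FAILED_CALL.match(line.rstrip("\n"))
        if match is None:
            continue  # successful call, signal line, or unfinished call
        if errno_name is not None and match.group("errno") != errno_name:
            continue
        yield (
            match.group("pid"),
            match.group("syscall"),
            match.group("errno"),
            match.group("message"),
        )


if __name__ == "__main__":
    # The two lines quoted in this post, with the log message shortened.
    trace = [
        '11045 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = -1 EMFILE (Too many open files)',
        '11045 write(2, "... NetworkAddressError: Name or service not known", 127) = 127',
    ]
    for pid, syscall, errno_name, message in failed_syscalls(trace):
        print(pid, syscall, errno_name, message)
```

Filtering with errno_name="EMFILE" would have pointed straight at the failed open of /etc/hosts, although plenty of failed calls in a normal trace are harmless (ENOENT while searching paths, for instance), so the output still needs reading with some care.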
It turns out the eventlet tracer was causing my code to leak file descriptors
(a problem I'm still investigating), eventually hitting my relatively low
ulimit. Once I bumped the limit in /etc/security/limits.conf, the problem
disappeared!
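To watch for this kind of leak from inside a Python process, something like the following sketch might help. It reads the open-file limit (the one reported by ulimit -n) through the standard resource module and counts the descriptors currently open by listing /proc/self/fd, which is Linux-specific. The function name fd_usage is hypothetical, and this is not how coverage or eventlet track descriptors.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process.

    Assumes a Linux-style /proc filesystem. The count includes the
    descriptor that os.listdir itself opens while reading the directory,
    so it is one higher than the number held by the rest of the program.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    open_fds = len(os.listdir("/proc/self/fd"))
    return open_fds, soft, hard


if __name__ == "__main__":
    open_fds, soft, hard = fd_usage()
    print("open descriptors:", open_fds)
    print("soft limit:", soft)
    print("hard limit:", hard)
```

Calling fd_usage() before and after a suspect piece of code and comparing the counts is a cheap way to confirm a leak: a count that keeps climbing towards the soft limit means EMFILE is on its way.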
I must remember to reach for strace sooner when trying to debug odd system
behaviours.