четверг, 27 ноября 2008 г.

Python logging in threaded application

As I said in previous post, I use Python logging module in my WEB server to log via syslog. I use CherryPy as an application server. There is a single Python process as a backend. CherryPy instanse is configured to run about 20 threads to serve the requests.

Using Python profiler I found out that threads spend more than half of their time blocked on a lock inside the logging module:

logging/__init__.py

class Handler(Filterer):
.................
def acquire(self):
"""
Acquire the I/O thread lock.
"""
if self.lock:
self.lock.acquire()
..................
def handle(self, record):
"""
Conditionally emit the specified logging record.

Emission depends on filters which may have been added to the handler.
Wrap the actual emission of the record with acquisition/release of
the I/O thread lock. Returns whether the filter passed the record for
emission.
"""
rv = self.filter(record)
if rv:
self.acquire()
try:
self.emit(record)
finally:
self.release()
return rv
................

As you can see, the lock is taken on every log message. Handler instance is created as a singleton for the application. Bad thing happens: all the threads lock on a single lock and perform an IO operation holding the lock!

What I've done so solve this: I created my own SmartSysLogHander class which uses a separate socket per thread to write to /dev/log. It does not contain lock a all.

Strange thing that such a thing happens using standard library for logging. I guess standard library is designed to be rock solid, not rocket speed. As Guido Van Rossum said, if you dont like threads and GIL in Python, just spawn several Python processes of your scalable application.

Discussion is welcome.

UnicodeEncodeError in Python logging

I use Python logging facility to write my logs via syslog. But when trying to log some Unicode message, emit method somewhere deep in logging throws UnicodeEncodeError. This happens because it tries to send Unicode string to a socket.

I googled it around and found no solution. File handlers support encoding parameters, but others do not.

The simplest way I found to fix this is to use custom formatter:

from logging import Formatter

class Utf8LogFormatter(Formatter):
def format(self, record):
return Formatter.format(self, record).encode('utf8')

Is this a Python logging system problem or am I doing something wrong?