суббота, 28 ноября 2009 г.

ssh hangup when working via router

I've discovered that my ssh connection occasionally hangs up when I am working through my WIFI router. And ssh works fine when the PC is connected directly to WAN.

This might happen because router drops the connection from its NAT table due to inactivity. To aid this, edit (or create) ~/.ssh/config and add there few lines:

Host *
ServerAliveInterval 60

пятница, 16 октября 2009 г.

Flash policy server

Flash player uses policy server to check its permission to open sockets to certain ports of certain server.

Adobe provides sample Flash policy server. But it is unusable for production. It creates a thread per connection. Also it shows strange virtual memory usage.

That is why I wrote simple flashpolicytwistd using Python/Twisted.

суббота, 12 сентября 2009 г.

Postgresql's huge disadvantage

When building a database which has a big deal of ranged queries, it might be extremely helpful to have clustered index support.

Let say, you keep a table of messages. To query last 10 at inbox it takes about 1 disk seek in worst case with clustered on index by (receiver, timestamp). When there is no clustering, be ready to issue 10 disk seeks.

InnoDB and MS SQL Server both have clustered index support. Instead, Postgresql provides CLUSTER command, which must be explicitly issued to rebuild internal database structure to cluster rows according to specified index. In order to keep you DB more or less clustered, you have to cron the CLUSTER command.

But:

1) CLUSTER takes exclusive lock on table. It took 2 hours to cluster my 3 GB of data. Daily cron would render my application to have 10% downtime. Nice. You can try to aid this by using pg_reorg.

2) Clustering does not change any logical data, only physical storage layout. Nevertheless, it generates the amount of WAL equal to the size of the data. Again, daily CLUSTER would add 3 GB of backup traffic. Same with pg_reorg.

All this makes clustered indexing in Postgresql unusable.

воскресенье, 6 сентября 2009 г.

Postgresql tuple/row

Q: What is the difference between tuple and row in Postgresql?
A: Tuple is a some version of a row.

понедельник, 17 августа 2009 г.

Serving static with nginx and varnish

I used nginx as reverse-proxy in front of amazon s3.

A month ago I decided to try varnish. It is designed from the ground up as a reverse-proxy. Also, I though that nginx solution wasted a lot of resources when keeping lots of tiny images in separate files.

But, after month of experiments, I discovered high iowait values and severe load on the hard disk, causing service problems. I rolled back to the previous nginx static scheme. Iowait dropped from frightening 100-150 to acceptable 25.

I used varnish 2.0.4 running with 3GB file storage. It consumed 0.5-1 GB of memory. Does anyone have a clue why varnish performed so much worser than nginx?

понедельник, 20 июля 2009 г.

Twisted logging pitfall

I run my Twisted processes as

twistd --logfile /var/log/somelogfile.log --pidfile /var/run/somepidfile.pid -y sometacfile.tac

Twisted chops and rotates log files by itself. By default it generates 10mb files chunks.

When current somelogfile.log becomes larger than 10mb, Twisted moves it to somelogfile.log.1 and continues logging to an empty file. If there are more than 2 chunks, they get their names so that larger number at the end corresponds to an older log. To achieve this, Twisted renames N log files, where N is the number of chunks.

In my system there were tens thousands of chunks. I did even realize that rotating them makes a huge stress for HDD, causing unexpected IOWAIT peaks. Moving the chunks to a separate folder eliminated the problem, preventing me from buying some more hardware :)

I'll investigate if it is possible to use logrotated or something similar to handle all this automatically.

вторник, 23 июня 2009 г.

Postgresql transaction counter (transactions per second)

There are transactions counters for each database in a cluster.

If you want to find out how many transactions has your system generated at the moment, you should connect to any database as a superuser (postgres) and

select sum(xact_commit) from pg_stat_database;

Easy, but took some time to find the recipe.