Wednesday, September 21, 2011

Cassandra migration from 0.6 to 0.7

Cassandra is not mature yet. I ran into data corruption errors in 0.6, found nothing that helped me fix them, and decided to migrate to 0.7 in the hope that the errors were fixed there.

All you have to do is follow the migration instructions in the NEWS file. But there are three pitfalls:

The libjna problem in the DEB package. Ubuntu ships an earlier libjna version than Cassandra requires, but the package still installs cleanly because its dependency version numbers are wrong. This results in very strange effects and errors. To fix it, install libjna manually, as described here.

The saved caches problem. Before starting 0.7 you have to delete the old saved_caches directory manually. Otherwise you get "Negative array size" exceptions on startup.

The Java heap size problem. After fixing the previous problems, I discovered a performance degradation in production. Analyzing it, I noticed that the Java process occupied 13 GB of the 24 GB of RAM; with 0.6 it took about 1-2 GB. In 0.7 the Cassandra init scripts set both the minimum and the maximum Java heap size (-Xms and -Xmx) to RAM/2. While that is OK for the maximum, setting -Xms to 12 GB means that this memory is not going to be used for your actual data. Cassandra accesses data via mmap, and mmap-ed data is served from the system page cache, which is shrunk by the 12 GB taken by the Java heap. You have to edit /etc/cassandra/cassandra-env.sh manually and set the heap to 2 GB (or so).
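
The arithmetic behind the heap problem can be sketched in a few lines of Python. This is only a rough model: it assumes that all RAM not reserved by the JVM heap is available to the page cache, and it ignores other processes and JVM overhead. The function name is made up for this illustration.

```python
def page_cache_budget_gb(total_ram_gb, heap_gb):
    """Rough upper bound on the RAM left for the OS page cache.

    Assumption: everything not reserved by the JVM heap can be used
    by the page cache (other processes and JVM overhead are ignored).
    """
    if heap_gb > total_ram_gb:
        raise ValueError("heap cannot exceed total RAM")
    return total_ram_gb - heap_gb


# 0.7 default on a 24 GB box: heap = RAM / 2
default_budget = page_cache_budget_gb(24, 24 / 2)
# After setting the heap to 2 GB in cassandra-env.sh
tuned_budget = page_cache_budget_gb(24, 2)

print(default_budget, tuned_budget)
```

Under these assumptions the page cache gets at most about 12 GB with the default heap and about 22 GB with a 2 GB heap.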

Sunday, May 22, 2011

ThinkPad X220

Finally got my X220. First, I replaced the HDD with a 160 GB Intel G2 SSD. It required a small hardware tweak: the X220 takes a 7 mm HDD and the SSD was about 12 mm high. I had to remove the plastic frame from the SSD.

DisplayPort turned out to be a disadvantage. Only a few display models support DisplayPort, and very few of them come with the corresponding cable in the box.

All the DisplayPort to HDMI/DVI cables seem to be half-working.

Tuesday, November 16, 2010

When to touch swappiness

There is a lot of discussion on the mailing lists about whether or not to touch the /proc/sys/vm/swappiness parameter, and there are no definitive answers. I found a situation where tuning the parameter really can improve performance.


On the machine:
  • RAID-1 of three HDDs
  • 12 GB RAM
  • Apache Cassandra instance with 25 GB of data
  • ejabberd instance
The IO is created by Cassandra, which reads many random data pages and occasionally writes sequential 100-200 MB chunks of data. Some more IO is created by swapping ejabberd memory in and out.

So most of the write load is created by swapping out random ejabberd memory pages. And we know that a RAID-1 of N disks is about N times better at reads than at writes: a read can be served by any one disk, while every write has to go to all of them. By decreasing the swappiness parameter from 60 to 20 I moved the IO load from writes to reads. Almost no random swap writes are left.

The IO load has really decreased. Not a huge optimization, but worth doing.
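
For reference, here is a minimal Python sketch of reading and changing the parameter. The path is the standard Linux one mentioned above; the function names and the 0-100 range check are choices made for this sketch, and writing to the real file requires root.

```python
from pathlib import Path

# Standard Linux location of the parameter
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current swappiness value as an integer."""
    return int(Path(path).read_text().strip())


def write_swappiness(value, path=SWAPPINESS_PATH):
    """Write a new swappiness value (needs root for the real file).

    This sketch only accepts the classic 0..100 range.
    """
    if not 0 <= value <= 100:
        raise ValueError("swappiness must be between 0 and 100")
    Path(path).write_text(f"{value}\n")
```

Calling write_swappiness(20) as root changes the value until the next reboot. To make it permanent, add vm.swappiness = 20 to /etc/sysctl.conf.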

Saturday, November 13, 2010

Apache Cassandra experience

In one of my projects I switched from PostgreSQL to Cassandra. There were three reasons for the switch.

First. For each user I had to keep an inbox for storing incoming messages and events. What is an inbox? It is a sorted collection of items, and the items are accessed using range queries. This caused a huge IO overhead on Postgres because of the lack of clustered indexes: rows are not kept physically sorted, so a range read touches pages scattered all over the disk. All "tables" in Cassandra are clustered, because they are kept as SSTables (sorted string tables).

Second. My application had a huge write throughput. Postgres is good at writes, with its write-ahead log and the absence of table locks on write, but even after write-oriented optimizations it was still not enough. Cassandra's write path is completely different, and it suits my needs better.

Third. The application servers are Python Twisted applications. There is one Postgres binding for Twisted, and it is abandoned and buggy. The Cassandra API is available via Thrift, which in turn supports Twisted. I recommend the great Telephus wrapper for Thrift and Twisted.

On Cassandra's IRC channel people tell each other about their Cassandra clusters. I look a bit stupid saying that I have a single node. But who cares? If it works better than Postgres for me, why not?

Disclaimer: I am not saying that Cassandra is better than PostgreSQL. It just suits this particular application better. I use PostgreSQL a lot in many other projects.
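
To illustrate why sorted storage helps here, below is a toy in-memory inbox in Python. It is not how Postgres or Cassandra store data, and the class and method names are invented for this sketch; it only shows that when items are kept sorted, a range query becomes two binary searches plus one contiguous slice.

```python
import bisect


class Inbox:
    """Toy inbox: items are kept sorted by timestamp."""

    def __init__(self):
        self._keys = []   # sorted timestamps
        self._items = []  # payloads, in the same order as _keys

    def add(self, timestamp, item):
        """Insert an item at its sorted position."""
        pos = bisect.bisect_right(self._keys, timestamp)
        self._keys.insert(pos, timestamp)
        self._items.insert(pos, item)

    def range(self, start, end):
        """Return the items with start <= timestamp < end."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return self._items[lo:hi]  # one contiguous slice


inbox = Inbox()
for ts, msg in [(30, "c"), (10, "a"), (20, "b"), (40, "d")]:
    inbox.add(ts, msg)

print(inbox.range(15, 40))
```

The items were added out of order, yet the range read returns a single contiguous run. On disk, the same property means sequential reads instead of random ones.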

Tuesday, November 9, 2010

Google AppEngine Experience

At first glance, AppEngine is really nice with all that cloud computing. Pay only for what you use. Scale indefinitely. Of course, there are some limitations, like a custom (Python or Java) environment with predefined APIs. But the APIs are really good and mostly sufficient.

At second glance, AppEngine is really, really nice! You'll find a great toolset in the SDK and the application management console: version management, quota settings, convenient shell scripts in the SDK for deployment and testing, log managers, a kind of simple profiler, etc. I can't imagine how much effort was spent on the toolset.

At third glance you'll find AppEngine unusable.

  • Two years after the release there are still unexpected errors in the management console. Sometimes I cannot get into it for hours.
  • When you need to delete a table from the datastore, cross your fingers. Sometimes a table becomes corrupted and you cannot delete it. Only recreating the application helps.
  • AppEngine pricing claims 10 cents per CPU hour. Good. But you have to use the CPU through the API. When I tried to upload my 1 GB database to AppEngine, it took some hours of real time and some days of AppEngine CPU time. It cost me about $60 just to upload my database! I have to admit, this is the hard part. But PostgreSQL moves this database back and forth in minutes!
  • Finally, I managed to port my application and upload all the data. But the cost per pageview is tremendous. It would cost me hundreds of bucks a month instead of my current inexpensive dedicated server (which is about 10% busy at peaks).
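
Taking the two numbers from the upload story at face value (10 cents per CPU hour, about $60 in total), a quick back-of-the-envelope calculation shows how much CPU time was billed. The function name is made up for this sketch, and amounts are in cents to keep the arithmetic exact.

```python
PRICE_CENTS_PER_CPU_HOUR = 10  # the 10 cents per CPU hour quoted above


def billed_cpu_hours(cost_cents, price_cents=PRICE_CENTS_PER_CPU_HOUR):
    """CPU hours that a given bill corresponds to."""
    return cost_cents / price_cents


hours = billed_cpu_hours(60 * 100)  # about $60 for the upload
days = hours / 24

print(hours, days)
```

That is about 600 CPU hours, or roughly 25 days of CPU time, for a 1 GB upload.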

Tuesday, September 28, 2010

Relay auth to an XMPP component

The standard ejabberd auth modules are odbc, ldap and external (with a script). There is also the original "internal" module, which keeps user data in Mnesia. But sometimes these modules are not enough.

Writing a custom auth module for ejabberd is easy. Copy, for example, ejabberd_auth_internal.erl and replace its interface functions with your own.

Recently I needed to relay auth to an XMPP component. I have to send an IQ from inside the check_password function. The problem is that check_password is called in the user session process, and routing for the session does not work yet. This means you won't receive XMPP stanzas in this function by simply calling receive.

Take a look at the working snippet:


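%% Needs -include("ejabberd.hrl") for ?INFO_MSG and -include("jlib.hrl")
%% for the #iq record. ?NS_SUP_AUTH is a custom namespace macro, not shown here.
check_password(User, Server, Password) ->
    SelfJid = jlib:string_to_jid(Server),
    AuthJid = jlib:string_to_jid("profile1." ++ Server),
    IQGet = #iq{
      type = get,
      sub_el = [{xmlelement, "query",
                 [{"xmlns", ?NS_SUP_AUTH}, {"user", User}, {"password", Password}],
                 []}]
     },
    Pid = self(),
    %% Tag the reply so that the receive below skips unrelated messages
    %% that may arrive in the session process mailbox.
    Ref = make_ref(),

    %% Called in the local router process; forwards the reply to us.
    F = fun(IQReply) ->
                Pid ! {Ref, IQReply}
        end,

    ejabberd_local:route_iq(SelfJid, AuthJid, IQGet, F),

    %% Block until the callback delivers the component's answer.
    receive
        {Ref, #iq{type = result}} ->
            true;
        {Ref, Other} ->
            ?INFO_MSG("Auth IQ for ~s failed: ~p", [User, Other]),
            false
    end.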


profile1.server.com is the JID of the auth component. The component receives an IQ with the user id and password and returns a "result" IQ if auth is fine, and an "error" IQ otherwise.

The trick here is to use ejabberd_local:route_iq. But then we need to block the call flow of check_password until the auth component returns the result. route_iq takes a function parameter, which is called in the local router process. The router dispatches the reply to that function based on its own IQ id -> function map. The other trick is to keep the pid of our original (client) process in the function's closure. Then we can safely block with receive and wait for a message from the function.
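
The same pattern (turning a callback-based call into a blocking one by capturing a reply channel in the callback's closure) can be sketched outside of ejabberd. The Python below is only an analogy, not ejabberd code: route_request stands in for the asynchronous router, the "secret" password check stands in for the auth component, and all names are invented for the illustration.

```python
import queue
import threading


def route_request(request, callback):
    """Stand-in for an asynchronous router: replies on another thread."""
    def worker():
        # Hypothetical auth component: accepts only the password "secret".
        ok = request["password"] == "secret"
        callback({"type": "result" if ok else "error"})

    threading.Thread(target=worker, daemon=True).start()


def check_password(user, password, timeout=5.0):
    """Blocking wrapper around the callback-based route_request."""
    # The queue plays the role of the caller's mailbox; the callback
    # reaches it through its closure, like Pid in the Erlang snippet.
    replies = queue.Queue(maxsize=1)

    route_request({"user": user, "password": password}, replies.put)

    try:
        reply = replies.get(timeout=timeout)  # block until the callback fires
    except queue.Empty:
        return False  # no answer in time: treat as failed auth
    return reply["type"] == "result"
```

Here check_password("alice", "secret") blocks until the worker thread calls back and then returns True, while any other password returns False.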

Friday, September 10, 2010

ThinkPad T61 warranty repair

I've been using a T61 for about two and a half years.
Recently the screen flashed and turned off while I was working. It didn't turn on again.

I checked my serial number on the Lenovo site and was happy to discover that a few months of warranty were left.

It took almost three weeks to fix. They replaced the LCD display (this was expected), then the motherboard and the panel with the touchpad. It seems the motherboard had burnt out together with the LCD. The plastic panel was a bit damaged, so they replaced it too :)