Emre Sokullu
Oct 16
Scalability at GROU.PS Presentation Notes
Fri 16 Oct 2009 08:57:10 | 2 comments

Things to bing (or google if you want to):
nginx, libevent, memcache, danga interactive, gearman, mogilefs, nosql, cassandra, hive, hadoop, scribe, thrift, maatkit (for mysql), mmm (for mysql)


MySQL useful settings:

query_cache_size = 0 # on master
query_cache_type = 0 # on master
thread_concurrency = 8 # total cores
max_connections = 750 # shouldn’t exceed that
innodb_buffer_pool_size = 10G  # a little less than the total RAM
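
To make the comments next to those settings concrete, here is a minimal Python sketch that parses the lines above and checks them against a host description. The helper names (`parse_size`, `parse_settings`, `check_settings`) and the example host (8 cores, 12 GiB of RAM) are made up for illustration, and reading "the total amount" as total RAM is my assumption; it is not a tool we run in production.

```python
import re

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def parse_size(value):
    """Turn a my.cnf-style size such as '10G' into bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value.strip().upper())
    if not match:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit, 1)

def parse_settings(text):
    """Parse 'key = value  # comment' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings

def check_settings(settings, cores, ram_bytes):
    """Return warnings where a value contradicts the comment beside it."""
    warnings = []
    if int(settings["thread_concurrency"]) != cores:
        warnings.append("thread_concurrency does not match the core count")
    if int(settings["max_connections"]) > 750:
        warnings.append("max_connections exceeds 750")
    if parse_size(settings["innodb_buffer_pool_size"]) >= ram_bytes:
        warnings.append("innodb_buffer_pool_size is not below total RAM")
    return warnings

MY_CNF = """
query_cache_size = 0 # on master
query_cache_type = 0 # on master
thread_concurrency = 8 # total cores
max_connections = 750
innodb_buffer_pool_size = 10G
"""

settings = parse_settings(MY_CNF)
# Hypothetical host: 8 cores, 12 GiB of RAM.
print(check_settings(settings, cores=8, ram_bytes=12 * 1024**3))
```

On the hypothetical host the list of warnings comes back empty; change `cores` or `ram_bytes` to see a setting get flagged.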

And, as promised, our typical sysctl additions. This is the configuration that lets us serve 1 PB per month to 3 million unique visitors:

net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_synack_retries = 2
## Emre edited
# http://www.oracle-base.com/articles/11g/OracleDB11gR1InstallationOnFedora8.php
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
# semaphores: semmsl, semmns, semopm, semmni
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default=4194304
#net.core.rmem_max=4194304
net.core.wmem_default=262144
#net.core.wmem_max=262144
fs.file-max=5049800
vm.swappiness=10
## Emre edited
# from http://forums.softlayer.com/showthread.php?t=3252
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_window_scaling = 1
net.ipv4.ip_nonlocal_bind=1
# http://rackerhacker.com/2007/08/24/apache-no-space-left-on-device-couldnt-create-accept-lock/
kernel.msgmni = 1024
# note: this second kernel.sem line overrides the one set earlier in this file
kernel.sem = 250 256000 32 1024
net.ipv4.ip_conntrack_max = 524288
net.ipv4.netfilter.ip_conntrack_max = 524288
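
Because the list above grew by pasting from several sources, a key can end up set more than once (kernel.sem is), and the last assignment in the file is the one that takes effect. A minimal Python sketch for spotting that, assuming the usual sysctl.conf layout of `key = value` lines with `#` or `;` comment lines; the function name `effective_sysctl` and the shortened sample are only illustrative:

```python
from collections import defaultdict

def effective_sysctl(text):
    """Return (effective, duplicates) for sysctl.conf-style text.

    effective maps each key to its last assigned value;
    duplicates maps keys set more than once to all their values, in order.
    """
    seen = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comment lines, and anything that is not an assignment.
        if not line or line[0] in "#;" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        seen[key.strip()].append(" ".join(value.split()))
    effective = {key: values[-1] for key, values in seen.items()}
    duplicates = {key: values for key, values in seen.items() if len(values) > 1}
    return effective, duplicates

# A shortened excerpt of the settings above.
SAMPLE = """
kernel.sem = 250 32000 100 128
#net.core.rmem_max=4194304
vm.swappiness=10
net.core.rmem_max = 8388608
kernel.sem = 250 256000 32 1024
"""

effective, duplicates = effective_sysctl(SAMPLE)
for key, values in duplicates.items():
    print(f"{key} set {len(values)} times, effective value: {effective[key]}")
```

Commented-out lines such as `#net.core.rmem_max=4194304` are ignored, so only keys that are really assigned twice get reported.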


Comments
Hidayet Dogan 4 months ago
Thanks for the great presentation and tricks. There are lots of solutions in the NoSQL and non-RDBMS trend (especially key/value storage engines). I'm currently researching non-relational (key/value) database/storage systems and full-text search solutions. I found MongoDB and Xapian, and I like their approach (C/C++ always rules). The big advantage is that MongoDB has an easy-to-use PHP extension, and so does Xapian, so they'll probably shorten development time. I've tested Lucene but it's a bit slow and ugly (I won't lie - the truth is I do not like Java). Zend Framework's Lucene implementation has lots of bugs (it's a really crap implementation, I hope they'll fix it soon). MemcacheDB is on my to-do list as a key/value storage solution. I'll send my benchmark results. It would be great to hear about other grou.ps experiments with storage solutions (especially key/value stores and full-text search engines).
Emre Sokullu 4 months ago
memcachedb is not what you really want at this stage because it's not distributed. it's great in some instances because it gives you the speed of memcache with a persistent storage option, but it's not a nosql replacement. the best solutions are probably hive - cassandra (facebook) and tokyo cabinet (mixi), because they're already being used by very large social networks. the others have no serious implementation yet: mongodb has some good backers and it seems user-friendly, but they need to convince some large players to gain more credibility among the developer community. voldemort is being used by linkedin but it's more like memcachedb afaik. so the market is too fragmented and there's no maturity yet.


