Friday, Oct 26, 2007
Paper: Wikipedia's Site Internals, Configuration, Code Examples and Management Issues

Wikipedia and Wikimedia have some of the best, most complete real-world documentation on how to build highly scalable systems. This paper by Domas Mituzas covers in detail how Wikipedia works, including: an overview of the software packages used (Linux, PowerDNS, LVS, Squid, lighttpd, Apache, PHP5, Lucene, Mono, Memcached), how they use their CDN, how caching works, how they profile their code, how they store their media, how they structure their database access, and how they handle search, load balancing, and administration. Everything is illustrated with real code examples and sample configuration files. This is a really useful resource.
Reader Comments (1)
Very detailed document really covering most (or all?) topics mentioned in the post.
I haven't finished reading it yet, but it's already absolutely clear that it's worth reading. Thanks for the link!