Entries in Book (20)

Wednesday
Oct 16, 2013

Interview With Google's Ilya Grigorik On His New Book: High Performance Browser Networking

If you are Google you don't just complain about performance on the web, you do something about it. Doing something about web performance is the job of one Ilya Grigorik, Developer Advocate on the Make the Web Fast team at Google, and author of a great new book: High Performance Browser Networking: What every web developer should know about networking and web performance.

That's a big topic, you might be saying to yourself. And it is. The book is 400-plus information-packed pages. But never fear. Ilya writes in a very straightforward style. It's like a man page for the web. Which is a good thing.

In case you are not familiar with Ilya, he's the perfect choice for writing such an ambitious book. For years Ilya has been producing excellent content on his blog and if you search YouTube you'll find presentation after presentation on the topics found in the book. Authority established.

Reading the book I was struck by what a complicated beast our little World Wide Web has become. That's clear from just the chapter titles: Primer on Latency and Bandwidth, Building Blocks of TCP, Building Blocks of UDP, Transport Layer Security, Introduction to Wireless Networks, WiFi, Mobile Networks, Optimizing for Mobile Networks, Brief History of HTTP, Primer on Web Performance, HTTP 1.X, HTTP 2.0, Optimizing Application Delivery, Primer on Browser Networking, XMLHttpRequest, Server-Sent Events, WebSocket, and WebRTC.

I've often imagined there's a white board at Google with "Don't Erase!" written in red at the top. On it is a complicated diagram of every part of the web and Google's master plan for making that part better. The book reads a little like that imagined diagram. The Primer on Latency is a real eye-opener. If you wonder why people are always going on about latency then this chapter is for you. The chapter on Mobile Networks is really good; it fills in a lot of details in an excessively complicated space. With HTTP 2.0 on the horizon you can learn about the entire thing here in full-on gory detail. I'd never heard of Server-Sent Events before so that was enlightening. And the WebRTC chapter has an amazing amount of detail on an exciting new browser capability.

If you've been burned by books that were just reprints of manuals and specification documents, that's not the case with this book. It's full of practical real-world advice. For example, there's one example of how "Apple engineers saw a 300% performance improvement for users on slower networks once they made better reuse of existing TCP connections within iTunes, via HTTP keepalive and pipelining!" Bits like this can be found throughout the book.
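That kind of win comes from plain connection reuse. As a rough illustration (my own sketch, not code from the book), here's how you might enable HTTP keep-alive on a shared agent in Node.js/TypeScript so repeated requests can ride the same TCP connection; the host and paths are placeholders.

```typescript
// Sketch (not from the book): reuse TCP connections by enabling HTTP
// keep-alive on a shared agent, avoiding a new TCP (and TLS) handshake
// for every request. Host and paths are placeholders.
import * as http from "http";

const agent = new http.Agent({
  keepAlive: true, // reuse sockets between requests
  maxSockets: 6,   // cap concurrent connections per host
});

function fetchOnce(path: string): Promise<number> {
  return new Promise((resolve, reject) => {
    http
      .get({ host: "example.com", path, agent }, (res) => {
        res.resume(); // drain the body
        res.on("end", () => resolve(res.statusCode ?? 0));
      })
      .on("error", reject);
  });
}

// Both requests can reuse the same TCP connection thanks to keep-alive.
fetchOnce("/a").then(() => fetchOnce("/b")).then(console.log);
```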

And many chapters end with a practical set of things you can do to implement all the stuff you've learned. For example, the WebRTC chapter ends with a Performance Checklist that begins with a few sections like these (a small configuration sketch follows the excerpt):

Signaling service
• Use a low-latency transport.
• Provision sufficient capacity.
• Consider using signaling over DataChannel once connection is established.
Firewall and NAT traversal
• Provide a STUN server when initiating RTCPeerConnection.
• Use trickle ICE whenever possible—more signaling, but faster setup.
• Provide a TURN server for relaying failed peer-to-peer connections.
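To make the NAT traversal items concrete, here's a minimal browser-side sketch (my own, not code from the book) of creating an RTCPeerConnection with a STUN and a TURN server configured, and handing off ICE candidates as they trickle in; the server URLs and credentials are placeholders.

```typescript
// Sketch: configure STUN and TURN servers when creating an RTCPeerConnection.
// The server URLs and TURN credentials below are placeholders.
const config: RTCConfiguration = {
  iceServers: [
    { urls: "stun:stun.example.org:3478" }, // STUN: discover our public address
    {
      urls: "turn:turn.example.org:3478",   // TURN: relay if peer-to-peer fails
      username: "demo-user",
      credential: "demo-pass",
    },
  ],
};

const pc = new RTCPeerConnection(config);

// With trickle ICE, candidates are signaled as they are found instead of
// waiting for gathering to finish, which shortens connection setup.
pc.onicecandidate = (event) => {
  if (event.candidate) {
    // send event.candidate to the remote peer over your signaling channel
    console.log("local ICE candidate:", event.candidate.candidate);
  }
};
```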

 

Almost every chapter has solid practical advice like this on how to make the web faster for your application.

Here's my email interview with Ilya Grigorik on High Performance Browser Networking. Enjoy.

Please Tell Us Who You Are And What You've Brought To Show And Tell Today?

Click to read more ...

Monday
Jun 29, 2009

How to Succeed at Capacity Planning Without Really Trying: An Interview with Flickr's John Allspaw on His New Book


Update 2: Velocity 09: John Allspaw, 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr. Insightful talk. Some highlights: Change is good if you can build tools and culture to lower the risk of change. Operations and developers need to become of one mind and respect each other. An automated infrastructure is the one tool you need most. Common source control. One step build. One step deploy. Don't be a pussy, deploy. Always ship trunk. Feature flags - don't branch code, make features runtime configurable in code. Dark launch - release data paths early without UI component. Shared metrics. Adaptive feedback to prioritize important features. IRC for communication for human context. Best solutions occur when dev and op work together and trust each other. Trust is earned by helping each other solve their problems. Look at what new features imply for operations, what can go wrong, and how to recover. Provide knobs and levers to help operations. Devs should have access to production machines. Fire drills to train. No finger pointing - fix stuff first. Design like you'll get woken up first when there's a problem. Say you're sorry. Not easy - like any relationship.
Update: Operational Efficiency Hacks, Web 2.0 Expo 2009, by John Allspaw. 131 picture-perfect slides on operations porn. If you're interested in that kind of thing.

Dream with me a little bit. Your startup becomes wildly successful. Hard work and random chance have smiled on you. To keep flirting with lady luck your system must scale. But how much stuff (space, hardware, software, etc) will you need to handle the growth, when will you need it and when will you need more?

That's what Flickr's John Allspaw helps you figure out in his groundbreaking new book on capacity planning: The Art of Capacity Planning: Scaling Web Resources.

When I read statements about The Art of Capacity Planning like "capacity planning is a term that to me means paying attention," "all the information you need to make an educated forecast is in your historical metrics," and "startups that are going to experience massive growth simply don't have time for anything but a 'steering by your wake' approach," I get the same sea-change feeling I felt when the industry ran from waterfall design and embraced agile design. Current capacity planning is heavy. All up-front. Too analytical and too divorced from real life.

Other capacity planning books assault you with models, math, and simulations. Who has the time? John has developed a common-sense, low-math approach to capacity planning that works using the system you already have. John's goal is to have you say: "Oh, right, duh. That's common sense, not voodoo."

Here's my email interview with John Allspaw on The Art of Capacity Planning. Enjoy.

Please tell us who you are and what you've brought to show and tell today?

I'm John Allspaw. I manage the Operations team at Flickr.com, and I've written a book (The Art of Capacity Planning: Scaling Web Resources) about capacity planning for growing websites.

After spending a good chunk of your life writing this book, can you summarize it in just a few sentences so people will know why it should matter to them?


This book is basically a guide to adaptive capacity planning for growing websites. It's an approach that relies much less on benchmarking and simulation than on close observation of production loads to guide future decisions. It's not rocket science, and I'm hoping people can use it to justify the what, why, and when of getting more resources to allow them to grow as fast as they need to. It's worked really well for me at Flickr and other organizations.

Give me your ripple of evil. What happens without capacity planning?

Capacity planning is a term that to me means paying attention. Web applications can fail in all sorts of dramatic ways, and you're not going to foresee all of them. What you can do, however, is make use of what you do know about what happens in your real world on a regular basis. Things like: my database can do X queries per second before it keels over. Or my cache can only keep Y minutes worth of changing objects. You're not going to predict every failure mode of the whole system, but knowing the failure modes of individual pieces should be considered mandatory. Armed with that, you can make decent forecasts about the future.

I'm a guy or gal at a startup who's freaking out because my boss asked me how much hardware we need for the next quarter/year. What do I do now?

Buy my book? ☺ All the information you need to make an educated forecast is in your historical metrics. You do have system and application-level statistics, right? Tying your system level stats (CPU, memory, network, storage, etc.) to application-level metrics (users, posts, photos, videos, page views, widgets sold, etc.) is key, because then you have history to back up the guesstimates. Your business, product or marketing team also has their own guesses for some of those application-level metrics, so you should get the two forecasts together and see where/how they match up or differ. Capacity should enable the business, not hinder it.
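As a rough illustration of what tying system stats to application metrics can look like in practice (my own sketch, not from the book, with invented numbers), you can compute how much of a resource each unit of application work costs and project forward using the business forecast:

```typescript
// Sketch: tie a system-level metric (CPU) to an application-level metric
// (photo uploads) and project hardware needs from a business forecast.
// All numbers are invented for illustration.

interface DailySample {
  uploads: number;        // application-level metric
  avgCpuPercent: number;  // system-level metric across the upload cluster
  servers: number;        // servers in the cluster that day
}

const history: DailySample[] = [
  { uploads: 800_000, avgCpuPercent: 40, servers: 10 },
  { uploads: 900_000, avgCpuPercent: 45, servers: 10 },
  { uploads: 1_000_000, avgCpuPercent: 50, servers: 10 },
];

// Average "CPU cost" of one upload, in server-CPU-percent units.
const costPerUpload =
  history.reduce((sum, d) => sum + (d.avgCpuPercent * d.servers) / d.uploads, 0) /
  history.length;

// Marketing forecasts 1.8M uploads/day next quarter; plan to a 60% CPU ceiling.
const forecastUploads = 1_800_000;
const cpuCeilingPercent = 60;
const serversNeeded = Math.ceil((forecastUploads * costPerUpload) / cpuCeilingPercent);

console.log(`~${serversNeeded} servers to stay under ${cpuCeilingPercent}% CPU`);
```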

So your book, is it the best book on Capacity Planning or the greatest book on Capacity Planning?

My book is the best book on capacity planning. If it were the 'greatest' book, it'd be a lot bigger than it is. It's good for scaling your hardware. It's not good for flattening out posters.

My site is gonna explode and I don't know at what point it's going to die. What do I do now?

Argh! Panic! Handle the current explosion, and make it priority one to find out the limits of your capacity when the emergency is over.

Try to panic gracefully. If it's dying right now, and you can't easily add any more capacity, then try some of the tried and trusted WebOps 101 tricks mentioned everywhere, including this blog:
- disable features (preferably the heavier load-causing ones)
- cache previously dynamic content into static bits
- avoid loaded backend calls by serving stale content
Of course it's easier to do those things when you have easy config flags to turn things on or off, and a list to run through of what things are acceptable to serve stale and static. We currently have about 195 'features' we can turn off at Flickr in dire circumstances. And we've used those flags when we needed to.
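The Flickr flags themselves aren't shown, but a minimal sketch of the idea (my own, with invented flag names and an invented config source) looks something like this: runtime-configurable switches that let you shed heavy features and fall back to stale content without a deploy.

```typescript
// Sketch of runtime feature flags: heavy features can be switched off and
// stale content served during an emergency, without redeploying code.
// Flag names and the config source are invented for illustration.

type Flags = Record<string, boolean>;

// In real life this would be read from a config service, database, or file
// that operations can change at runtime.
let flags: Flags = {
  related_photos: true,      // heavy, nice-to-have feature
  realtime_counts: true,     // expensive backend calls
  serve_stale_search: false, // fall back to cached search results
};

function isEnabled(name: string): boolean {
  return flags[name] === true;
}

function renderPhotoPage(photoId: string): string {
  const parts = [`photo:${photoId}`];
  if (isEnabled("related_photos")) parts.push("related-photos-panel");
  if (isEnabled("realtime_counts")) parts.push("live-view-counts");
  return parts.join(" + ");
}

// During an incident, ops flips the flags instead of shipping code.
flags = { ...flags, related_photos: false, realtime_counts: false, serve_stale_search: true };
console.log(renderPhotoPage("12345")); // the heavy features are gone
```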

Having said all of that, knowing when your resources are going to die should be mandatory, not optional. Know how many qps your databases, webservers, caching systems, and storage can handle before degradation. Test that stuff. With production traffic. If at all possible, with live production traffic, not just recorded and replayed production loads.

How do you compare your approach with well-known approaches like Guerrilla Capacity Planning? Isn't focusing on queuing theory, Little's Law, and Poisson arrival rates misguided? What will really help people on the ground?

My approach is a bit different from Mr. Gunther's, although his book was an inspiration for mine. I do think that having a general understanding of queuing theory and the mathematics of open and closed systems can be important, don't get me wrong. But many of the startups that are going to experience massive growth simply don't have time for anything but a 'steering by your wake' approach, and done right, I think that approach will serve them well. I think some people recognize this, but in my experience, I'd say the development timelines are even tighter than most people realize.

Almost any time and effort spent constructing and running a simulation, model, or benchmark that involves the main moving parts of a web site's back-end is pretty much wasted, because application logic, use cases, and even hardware configurations change so quickly. For example, by the time I could construct a useful model to capture the webserver-database interactions for Flickr, the results wouldn't resemble production load, since the development cycle we have is so tight. As I write this, in the last week there were 50 code deploys of 550 changes by 19 Flickr staff. I realize that's a lot more than most organizations, but that rate of change requires that the capacity planning process be easily adjustable.

I also found the existing books on the topic to be pretty dry with the math. The foundations of queuing theory are interesting, but a proof of Little's Law isn't going to quickly justify to my finance guy that we need 11 more webservers.
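(For readers who haven't met it, Little's Law is the queueing result being referred to: L = λ × W, i.e. the average number of requests in the system equals the arrival rate multiplied by the average time each request spends in the system. As a purely illustrative example of mine, 3,000 requests per second with a 0.2-second average response time means roughly 600 requests in flight at any moment.)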

I have a cloud, do I really need to capacity plan anymore? That's so old school. Can't I decrease my administrator-to-server ratio and reduce the cost of managing systems by removing capacity planning resources?

Nope, just trust The Cloud™ that everything will be all right. No need to pay any attention at all. Oh wait, that's a magic unicorn talking. Cloud services are a resource just like any in-house capacity: they have limits that should be paid attention to. The best use cases of cloud computing recognize both the benefits (shrinking 'procurement' times, for example) and the limitations. And those limitations are going to vary from application to application, organization to organization.

You can build a real brand around a name like "Guerrilla Capacity Planning." Don't you think you could have come up with a better name for your approach?

Yeah, I guess I should have a name for it. How about "Paying Attention Capacity Planning"?

You recommend testing the limits of your hardware and software on a production site. Are you crazy?

It's very possible that I'm crazy. But using production traffic to define your resource ceilings in a controlled setting allows you to see firsthand what would happen when you run out of capacity in a particular resource. Of course I'm not suggesting that you run your site into the ground, but it's better to know what your real (not simulated) loads are while you're watching than to find out the hard way. In addition, a lot of unexpected systemic things can happen when load increases in a particular cluster or resource, and playing "find the butterfly effect" is a worthwhile exercise.

Where does capacity planning sit in the stack? It seems to impact the entire application stack, but capacity planning is more of an operations thing, and that may not have much impact on engineering.

Capacity planning is a process, and should sit in the stack along with the other things that happen on a regular basis, like bug scrubs, like weekly meetings, etc. I'm spoiled as an Ops manager because we've got developers at Flickr who very much think like operations people. We're all addicted to graphs, and we're all pretty aware of how close each piece of the infrastructure is to becoming too 'hot', and we act accordingly. Planning, procurement, and measurement of capacity might lie on operations' shoulders, but intelligent use of it is everyone's business. I'm also blessed to have product management and customer care teams who are also aware of the state of capacity. As projects pop up that warrant capacity to be part of the considerations, it is. You simply can't launch things like FlickrVideo and the Yahoo Photos migration without capacity being part of the requirements, so we've got a good feedback loop going with respect to operations and capacity.

Studies say 4/5 of people like a trip to the dentist more than they like capacity planning. Why does capacity planning have such a bad rep?

Because no one wants to guess and be wrong? Because most of the cap planning literature out there is filled with boring math?

Does the move to Web 2.0 make traditional approaches to capacity planning more difficult?

I would say yes, but of course I'm biased. In many ways, use of the site is simply out of our hands. We gather metrics on as many aspects of the site as we can get our hands on, and add more all the time. Having an open API makes things interesting, because the use cases can vary wildly. Out of nowhere, a simple application can reveal an edge case that we hadn't foreseen, so we might have to adjust forecasts quickly. As to it being more difficult: I don't know. When I'm wrong, people can't see photos. If the guy at wellsfargo.com is wrong, then people can't get their money. What's more annoying? :)

I really like your dashboard. Most dashboards say what happened; yours has an estimate of days left before capacity runs out, which makes it actionable. How accurate have those numbers been?

Well, that particular dashboard is still being built here, but the general design is to capture those metrics on a regular basis and to use them as a guide. The numbers for some clusters have been pretty trustworthy. But they do fluctuate in how accurate they are, since "days left" is obviously affected by things outside of the forecasting process. Sometimes a new feature is planned that requires pounding the database shards, or bumps CPU on the webservers by a noticeable amount. It's because of these things that the dashboard is only a piece of the puzzle. Awareness and communication with product management and development have to inform planning decisions, as well as the historic metrics. Again: no crystal balls here.

One of the equations you use is "UR LIMITZ = Ceiling * Factor of Safety". Theo Schlossnagle has noted the evolution of phenomenal spikes in traffic, even for large sites. How do you have enough capacity as a safety factor if traffic is so spiky? What are the implications for forecasting?

Start with the assumption that no one can accurately predict the future. Capacity planning isn't the only part of successful web operations, and not everything is a capacity issue. Theo nails what happens in the real-world, for sure. The answer is: you estimate the best you can with the history that you have. Obviously, one mistake is to make forecasts ignoring the spikes you've experienced in the past. Something important to consider is how expected (or not) your spikes are. If you sell things, you might have a seasonal holiday spike or plateau in traffic. If you're a content site that frequently gains traction with news-related sites (like Theo's example) and from time to time, you get massive unexpected spikes, then your factor of safety should include considerations for those spikes. But of course the flipside of that might mean that you have a boatload of servers doing nothing except wasting money while waiting for massive spikes that may or may not come. So it's a balance. Be reasonable with planning, follow the four guidelines that Theo points out in his post, and you've got yourself a strategy to deal with those unexpected spikes.
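To make the "ceiling times factor of safety" idea concrete, here is a tiny sketch (mine, not from the book, with invented numbers) of the kind of arithmetic behind the "days left" dashboard mentioned earlier: take a measured ceiling, discount it by a safety factor, and project when your current growth trend crosses it.

```typescript
// Sketch: "days left" estimate from a measured ceiling, a factor of safety,
// and a simple linear growth trend. All numbers are invented.

const measuredCeilingQps = 4000;   // where this database cluster starts to degrade
const factorOfSafety = 0.75;       // leave headroom for spikes and failures
const safeLimitQps = measuredCeilingQps * factorOfSafety; // 3000 qps

const currentPeakQps = 2200;       // today's daily peak
const growthQpsPerDay = 18;        // slope from recent history (e.g. the last 60 days)

const daysLeft = Math.floor((safeLimitQps - currentPeakQps) / growthQpsPerDay);
console.log(`~${daysLeft} days before this cluster hits its safe limit`);
// Spikes, new features, and seasonality all move this number, which is why
// it is a guide for conversations with product and dev, not a guarantee.
```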

At what point does it make sense for me to buy my own equipment?

It's a difficult question. I've seen groups go from self-run colocation to managed hosting and cloud services, and others go the opposite way, for all legit reasons. Just as there are limitations with managed hosting, those limitations might not matter until you're massive. I do believe that there is a point at which owning your own equipment makes a lot of sense, because you've blown past the average needs of a managed hosting or cloud customer. Your TCO might be a lot different than mine, so to state that there's a single point for everyone would just be dumb.

Quick fire round. What's your quick reaction to:

1. Let's just go with a working prototype for now. We can change it when we grow big.

Fine, for some definitions of "working", "change", and "big". I can't complain too much about that approach in some sense because that's how a lot of Flickr's backend evolved. But on the other hand, all you need is to fail a couple of times with prototypes to figure out what sort of homework needs to be done before launching something that is potentially explosive. So again, there's a balance. Don't be lazy, but don't be rigid and too fearful.

2. Our VC told us that we're worrying about scalability too early. They don't want us to blow our scarce resources on preparing for success.

Your VC is smart. If they're really smart, they'll also suggest worrying about scalability before it's too late. This answer isn't a cop-out, it's just reality. Realize that being scalable means being able to easily add capacity wherever you need it, whenever you need it. Buying too much equipment too soon is just as insane as hiring people that sit around and do nothing. The trick is knowing what "too much" and "too soon" is, and it's going to differ from company to company, from product to product.

3. Premature optimization is the root of all evil.

I've not made it a secret that I think this quote has given many an engineer (both dev and ops) reasoning to make dumb decisions. I actually agree with the idea behind the quote, but I also think that it's another one of those balancing acts. Knuth (or Hoare) also said "We should forget about small efficiencies, say about 97% of the time…" Another good question might be: which 3% are you going to pay attention to?

4. We plan to throw hardware at the problem.

Ok. When does that hardware need to come, how much of it, and what will you do when you realize more hardware won't fix a particular problem? The now-classic limitations of the single-write-master, many-read-slaves database architecture are a great example of this. Experienced devs and ops people consider the possibility that hardware can't solve all problems.

Note: these questions were adapted from How Important Is Being Scalable?.

Now that you have some free time again, what are you going to do with your life?

Eat. Sleep. Pay attention to my family. ☺

Related Articles

  • Capacity Management for Web Operations by John Allspaw

    Thursday
    Jun 4, 2009

    New Book: Even Faster Web Sites: Performance Best Practices for Web Developers

    Performance is critical to the success of any web site, and yet today's web applications push browsers to their limits with increasing amounts of rich content and heavy use of Ajax. In his new book Even Faster Web Sites: Performance Best Practices for Web Developers, Steve Souders, web performance evangelist at Google and former Chief Performance Yahoo!, provides valuable techniques to help you optimize your site's performance.

    Souders' previous book, the bestselling High Performance Web Sites, shocked the web development world by revealing that 80% of the time it takes for a web page to load is on the client side. In Even Faster Web Sites, Souders and eight expert contributors provide best practices and pragmatic advice for improving your site's performance in three critical categories:

    • JavaScript - Get advice for understanding Ajax performance, writing efficient JavaScript, creating responsive applications, loading scripts without blocking other components, and more.

    • Network - Learn to share resources across multiple domains, reduce image size without loss of quality, and use chunked encoding to render pages faster.

    • Browser - Discover alternatives to iframes, how to simplify CSS selectors, and other techniques.

    Speed is essential for today's rich media web sites and Web 2.0 applications. With this book, you'll learn how to shave precious seconds off your sites' load times and make them respond even faster.
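    As a flavor of the JavaScript advice above, one widely used technique for loading a script without blocking the rest of the page is to inject it dynamically. The sketch below is a generic illustration of mine, not code from the book; the script URL is a placeholder.

```typescript
// Sketch: load a script without blocking HTML parsing or other downloads.
// The URL is a placeholder; error handling is minimal for brevity.
function loadScriptAsync(src: string, onLoad?: () => void): void {
  const script = document.createElement("script");
  script.src = src;
  script.async = true; // hint to the browser: don't block on this script
  if (onLoad) script.addEventListener("load", onLoad);
  document.head.appendChild(script);
}

loadScriptAsync("https://example.com/widget.js", () => {
  console.log("widget.js loaded without blocking the page");
});
```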

    About the Author

    Steve Souders works at Google on web performance and open source initiatives. His book High Performance Web Sites explains his best practices for performance along with the research and real-world results behind them. Steve is the creator of YSlow, the performance analysis extension to Firebug. He is also co-chair of Velocity 2008, the first web performance conference sponsored by O'Reilly. He frequently speaks at such conferences as OSCON, Rich Web Experience, Web 2.0 Expo, and The Ajax Experience.

    Steve previously worked at Yahoo! as the Chief Performance Yahoo!, where he blogged about web performance on Yahoo! Developer Network. He was named a Yahoo! Superstar. Steve worked on many of the platforms and products within the company, including running the development team for My Yahoo!.

    Monday
    Mar 16, 2009

    Books: Web 2.0 Architectures and Cloud Application Architectures

    I am excited about the upcoming release of two books on Web 2.0 and Cloud Application Architectures by O'Reilly.

    Web 2.0 Architectures (estimated release in May 2009): What entrepreneurs and information architects need to know. Using several high-profile Web 2.0 companies as examples, authors Duane Nickull, Dion Hinchcliffe, and James Governor have distilled the core patterns of Web 2.0 coupled with an abstract model and reference architecture. The result is a base of knowledge that developers, business people, futurists, and entrepreneurs can understand and use as a source of ideas and inspiration. Featured architectures include Google, Flickr, BitTorrent, MySpace, Facebook, and Wikipedia.

    Cloud Application Architectures (estimated release in April 2009): Building Applications and Infrastructure in the Cloud. This book by George Reese offers tested techniques for creating web applications on cloud computing infrastructures and for migrating existing systems to these environments. Specifically, you'll learn about the programming and system administration necessary for supporting transactional web applications in the cloud -- mission-critical activities that include orders and payments to support customers.

    The second book is available online at O'Reilly as a Rough Cuts version, so you might already have had a chance to check it out. If so, do you like it?

    Click to read more ...

    Friday
    Oct 10, 2008

    The Art of Capacity Planning: Scaling Web Resources

    Update 3: The book was released! Find it on Amazon at The Art of Capacity Planning.

    Update 2: Maybe the iPhone can use a little capacity planning? From What's Behind the iPhone 3G Glitches: one source says Apple programmed the Infineon chip to demand a more powerful 3G signal than the iPhone really requires. So if too many people try to make a call or go on the Internet in a given area, some of the devices will decide there's insufficient power and switch to the slower network, even if there is enough 3G bandwidth available.

    Update: To get a taste of what will be served, mySQL DBA has a nice post titled Capacity Planning, Architecture, Scaling, Response time, Throughput. You learn how to figure out when your application will break by fitting a 3rd order polynomial to your measurements. Cool stuff!

    John Allspaw, the Operations Engineering Manager at Flickr, is about to publish a book with O'Reilly. There aren't many details so far, but it seems interesting and relevant to High Scalability. Allspaw combines personal anecdotes from many phases of Flickr's growth with insights from his colleagues in many other industries to give you solid guidelines for measuring your growth, predicting trends, and making cost-effective preparations. Topics include:

    • Evaluating tools for measurement and deployment
    • Capacity analysis and prediction for storage, database, and application servers
    • Designing architectures to easily add and measure capacity
    • Handling sudden spikes
    • Predicting exponential and explosive growth
    • How cloud services such as EC2 can fit into a capacity strategy
    The Art of Capacity Planning: Scaling Web Resources is available for pre-order on amazon.com

    Click to read more ...

    Monday
    Sep 8, 2008

    Guerrilla Capacity Planning and the Law of Universal Scalability

    In the era of Web 2.0, traditional approaches to capacity planning are often difficult to implement. Guerrilla Capacity Planning facilitates rapid forecasting of capacity requirements based on the opportunistic use of whatever performance data and tools are available. One unique Guerrilla tool is Virtual Load Testing, based on Dr. Gunther's "Universal Law of Computational Scaling," which provides a highly cost-effective method for assessing application scalability. Neil Gunther, M.Sc., Ph.D. is an internationally recognized computer system performance consultant who founded Performance Dynamics Company in 1994.

    Some reasons why you should understand this law:

    1. A lot of people use the term "scalability" without clearly defining it, let alone defining it quantitatively. Computer system scalability must be quantified. If you can't quantify it, you can't guarantee it. The universal law of computational scaling provides that quantification.

    2. One of the greatest impediments to applying queueing theory models (whether analytic or simulation) is the inscrutability of service times within an application. Every queueing facility in a performance model requires a service time as an input parameter. No service time, no queue. Without the appropriate queues in the model, system performance metrics like throughput and response time cannot be predicted. The universal law of computational scaling leapfrogs this entire problem by NOT requiring ANY low-level service time measurements as inputs.

    The universal scalability model is a single equation expressed in terms of two parameters, α and β. The relative capacity C(N) is a normalized throughput given by:

    C(N) = N / (1 + αN + βN(N − 1))

    where N represents either:

    1. (Software scalability) the number of users or load generators on a fixed hardware configuration. In this case, the number of users acts as the independent variable while the CPU configuration remains constant for the range of user load measurements.

    2. (Hardware scalability) the number of physical processors or nodes in the hardware configuration. In this case, the number of user processes executing per CPU (say 10) is assumed to be the same for every added CPU. Therefore, on a 4 CPU platform you would run 40 virtual users.

    Here α (alpha) is the contention parameter and β (beta) is the coherency-delay parameter. This model has widespread applicability, including:

    • Accounts for such effects as VM thrashing, and cache-miss latencies.
    • Can also be used to model disk arrays, SANs, and multicore processors.
    • Can also be used to model certain types of network I/O.
    • The user-load form is the most common application of the equation.
    • Can be used in combination with measurement tools like LoadRunner, Benchmark Factory, etc.
    Gunther's book, Guerrilla Capacity Planning: A Tactical Approach to Planning for Highly Scalable Applications and Services, and its companion, Analyzing Computer System Performance with Perl::PDQ, offer practical advice on capacity planning. Distilled notes from his Guerrilla Capacity Planning (GCaP) classes are available online in The Guerrilla Manual.
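    Here is a small numeric sketch (mine, not from Gunther's materials) of the scalability model described above: evaluate C(N) for a few values of N with assumed α and β to see where throughput stops growing.

```typescript
// Sketch: relative capacity C(N) from the Universal Scalability Law,
// C(N) = N / (1 + α·N + β·N·(N − 1)), with made-up α and β values.
function relativeCapacity(n: number, alpha: number, beta: number): number {
  return n / (1 + alpha * n + beta * n * (n - 1));
}

const alpha = 0.03;  // contention (serialization) parameter, assumed
const beta = 0.0005; // coherency-delay parameter, assumed

for (const n of [1, 8, 16, 32, 64, 128]) {
  console.log(`N=${n}  C(N)=${relativeCapacity(n, alpha, beta).toFixed(1)}`);
}
// With these parameters capacity peaks (here around N ≈ 44) and then falls off
// as N keeps growing, which is the retrograde behavior the coherency term models.
```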

    Click to read more ...

    Tuesday
    Jul 31, 2007

    Book: Scalable Internet Architectures

    As a developer, you are aware of the increasing concern amongst developers and site architects that websites be able to handle the vast number of visitors that flood the Internet on a daily basis. Scalable Internet Architectures addresses these concerns by teaching you both good and bad design methodologies for building new sites, and how to scale existing websites into robust, high-availability websites. Primarily example-based, the book discusses major topics in web architectural design, presenting existing solutions and how they work. Technology budget tight? This book will work for you, too, as it introduces new and innovative concepts for solving traditionally expensive problems without a large technology budget. Using open source and proprietary examples, you will be engaged in best-practice design methodologies for building new sites, as well as appropriately scaling both growing and shrinking sites. Website development help has arrived in the form of Scalable Internet Architectures.

    Click to read more ...

    Monday
    Jul 16, 2007

    Book: High Performance MySQL

    As users come to depend on MySQL, they find that they have to deal with issues of reliability, scalability, and performance--issues that are not well documented but are critical to a smoothly functioning site. This book is an insider's guide to these little understood topics. Author Jeremy Zawodny has managed large numbers of MySQL servers for mission-critical work at Yahoo!, maintained years of contacts with the MySQL AB team, and presents regularly at conferences. Jeremy and Derek have spent months experimenting, interviewing major users of MySQL, talking to MySQL AB, benchmarking, and writing some of their own tools in order to produce the information in this book.

    In High Performance MySQL you will learn about MySQL indexing and optimization in depth so you can make better use of these key features. You will learn practical replication, backup, and load-balancing strategies with information that goes beyond available tools to discuss their effects in real-life environments. And you'll learn the supporting techniques you need to carry out these tasks, including advanced configuration, benchmarking, and investigating logs. Topics include:

    • A review of configuration and setup options
    • Storage engines and table types
    • Benchmarking
    • Indexes
    • Query Optimization
    • Application Design
    • Server Performance
    • Replication
    • Load-balancing
    • Backup and Recovery
    • Security

    Click to read more ...

    Sunday
    Jul 15, 2007

    Web Analytics: An Hour a Day

    Web Analytics: An Hour A Day is the first book by an in-the-trenches practitioner of web analytics. It provides a unique insider's perspective of the challenges and opportunities that web analytics presents to each person who touches the Web in your organization. Rather than spamming you with metrics and definitions, Web Analytics: An Hour A Day will enhance your mindset and teach you how to fish for yourself. Avinash Kaushik is an expert in web analytics and author of the top-rated blog Occam's Razor (http://www.kaushik.net/avinash). In this book, he goes beyond web analytics concepts and definitions to provide a step-by-step guide to implementing a successful web analytics strategy. His revolutionary approach to web analytics challenges prevalent thinking about the field and guides readers to a solution that will provide truly informed and actionable insights.

    Click to read more ...

    Sunday
    Jul 15, 2007

    Book: Building Scalable Web Sites

    Building, scaling, and optimizing the next generation of web applications. Learn the tricks of the trade so you can build and architect applications that scale quickly--without all the high-priced headaches and service-level agreements associated with enterprise app servers and proprietary programming and database products. Culled from the experience of the Flickr.com lead developer, Building Scalable Web Sites offers techniques for creating fast sites that your visitors will find a pleasure to use.

    Creating popular sites requires much more than fast hardware with lots of memory and hard drive space. It requires thinking about how to grow over time, how to make the same resources accessible to audiences with different expectations, and how to have a team of developers work on a site without creating new problems for visitors and for each other. Topics include:

    • Presenting information to visitors from all over the world
    • Integrating email with your web applications
    • Planning hardware purchases and hosting options to have as much as you need without breaking your wallet
    • Partitioning and distributing databases to support large datasets and simultaneous transactions
    • Monitoring your applications to find and clear bottlenecks
    • Providing services APIs and using services from other providers to increase your site's reach and capabilities

    Whether you're starting a small web site with hopes of growing big or you already have a large system that needs maintenance, you'll find Building

    Click to read more ...
