Entries in Caching (29)

Thursday
Oct 03, 2019

Redis Cloud Gets Easier with Fully Managed Hosting on Azure

ScaleGrid, a rapidly growing leader in the Database-as-a-Service (DBaaS) space, has just launched their new fully managed Redis on Azure service. This Redis management solution allows organizations, from startups up to the enterprise level, to automate their Redis operations on Microsoft Azure dedicated cloud servers, alongside their other open source database deployments, including MongoDB, MySQL, and PostgreSQL.

Redis, the #1 key-value store and top 10 database in the world, has grown by over 300% in popularity over the past 5 years, per the DB-Engines knowledge base. The demand for Redis is skyrocketing across dozens of use cases, particularly for caching, queues, geospatial data, and high-speed transactions. This simple database management system makes it very easy to store and retrieve pairs of keys and values, and is commonly paired with other database types to increase the speed and performance of an application. According to the 2019 Open Source Database Report, a majority of Redis deployments are used in conjunction with MySQL, and over half of Redis deployments are used with PostgreSQL, MongoDB, or Elasticsearch.
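
To picture how Redis gets paired with another database to speed up reads, here is a minimal cache-aside sketch in Python using the redis-py client. The `fetch_user` helper and `query_database` function are hypothetical stand-ins for your own data layer, not part of any ScaleGrid API:

```python
import json
import redis

# Connect to a local Redis instance (adjust host/port for your deployment).
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def query_database(user_id):
    """Hypothetical stand-in for a MySQL/PostgreSQL lookup."""
    return {"id": user_id, "name": "example"}

def fetch_user(user_id, ttl=300):
    """Cache-aside: try Redis first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)           # cache hit: no database work
    user = query_database(user_id)          # cache miss: hit the database
    r.setex(key, ttl, json.dumps(user))     # store with a TTL so stale data expires
    return user
```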

ScaleGrid’s Redis hosting service allows these organizations to automate all of their time-consuming management tasks, such as backups, upgrades, scaling, replication, sharding, monitoring, alerts, log rotations, and OS patching, so their DBAs, developers, and DevOps teams can focus on new product development and optimizing performance. Additionally, organizations can customize their Redis persistence and host through their own Azure account, which allows them to leverage advanced cloud capabilities like Azure Virtual Networks (VNET), Security Groups, and Reserved Instances to reduce long-term hosting costs by up to 60%.

“Cloud reliability has never been so important,” says Dharshan Rangegowda, Founder and CEO of ScaleGrid. “It’s crucial for organizations to properly configure their Redis deployments for high availability and disaster recovery, as a couple minutes of downtime can be detrimental to a company’s security and reputation.”

ScaleGrid is the only Redis cloud service that allows you to customize your master-slave and cross-datacenter configurations for 100% uptime and availability across 30 different Azure regions. They also allow you to keep full Redis admin access and SSH access to your machines, and you can learn more about their advantages over competitors Compose for Redis, RedisGreen, Redis Labs and Elasticache for Redis on their Compare Redis Providers page.

Tuesday
Sep 03, 2019

Top Redis Use Cases by Core Data Structure Types

Redis, short for Remote Dictionary Server, is a BSD-licensed, open-source, in-memory key-value data structure store written in C by Salvatore Sanfilippo and first released on May 10, 2009. Depending on how it is configured, Redis can act as a database, a cache, or a message broker. It’s important to note that Redis is a NoSQL database system. This implies that unlike SQL (Structured Query Language) driven database systems like MySQL, PostgreSQL, and Oracle, Redis does not store data in well-defined database schemas made up of tables, rows, and columns. Instead, Redis stores data in data structures, which makes it very flexible to use. In this blog, we outline the top Redis use cases by the different core data structure types.
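
As a quick taste of those core types, here is a brief Python sketch using the redis-py client; the key names are illustrative only:

```python
import redis

r = redis.Redis(decode_responses=True)

# String: simple key-value, e.g. a page-view counter.
r.set("page:home:views", 0)
r.incr("page:home:views")

# Hash: field-value pairs under one key, e.g. a user profile.
r.hset("user:42", mapping={"name": "Ada", "plan": "pro"})

# List: ordered, push/pop from either end, e.g. a job queue.
r.lpush("queue:emails", "welcome:42")

# Set: unordered unique members, e.g. tags.
r.sadd("post:7:tags", "redis", "caching")

# Sorted set: members ranked by score, e.g. a leaderboard.
r.zadd("leaderboard", {"ada": 2500, "bob": 1800})
print(r.zrevrange("leaderboard", 0, 2, withscores=True))
```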

Data Structures in Redis

Click to read more ...

Tuesday
Feb 19, 2019

Intro to Redis Cluster Sharding – Advantages, Limitations, Deploying & Client Connections

Redis Cluster is the native sharding implementation available within Redis that allows you to automatically distribute your data across multiple nodes without having to rely on external tools and utilities. At ScaleGrid, we recently added support for Redis Clusters on our platform through our fully managed Redis hosting plans. In this post, we’re going to introduce you to Redis Cluster sharding, discuss its advantages and limitations, explain when you should deploy it, and show how to connect to your Redis Cluster.
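
For a sense of what connecting looks like, here is a hedged Python sketch using the cluster client that ships with recent redis-py releases; the node address is a placeholder. Redis Cluster assigns each key to one of 16384 hash slots via CRC16(key) mod 16384, and the client routes each command to the node that owns the key's slot:

```python
from redis.cluster import RedisCluster

# Any reachable node will do; the client discovers the rest of the cluster.
rc = RedisCluster(host="127.0.0.1", port=7000, decode_responses=True)

# Reads and writes are routed to the node owning the key's hash slot.
rc.set("session:42", "active")
print(rc.get("session:42"))

# Hash tags ({...}) force related keys into the same slot, so multi-key
# operations on them stay on a single node.
rc.set("{user:42}:profile", "...")
rc.set("{user:42}:cart", "...")
```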

Sharding with Redis Cluster

Click to read more ...

Thursday
Oct 20, 2016

Future Tidal Wave of Mobile Video

In this article I will examine the growing trend of mobile Internet video, how consumer behaviour is rapidly adapting to a world of ‘always on’ content, and the impact on the underlying infrastructure.

Click to read more ...

Monday
Oct 18, 2010

NoCAP

In this post I wanted to spend some time on the CAP theorem and clarify some of the confusion I often see when people associate CAP with scalability without fully understanding the implications that come with it, as well as the alternative approaches.

You can read the full article here

Saturday
Sep 12, 2009

How Google Taught Me to Cache and Cash-In

A user named Apathy describes how Reddit scales some of their features and shares advice he learned while working at Google and other major companies.

To be fair, I [Apathy] was working at Google at the time, and every job I held between 1995 and 2005 involved at least one of the largest websites on the planet. I didn't come up with any of these ideas, just watched other smart people I worked with who knew what they were doing and found (or wrote) tools that did the same things. But the theme is always the same:

  1. Cache everything you can and store the rest in some sort of database (not necessarily relational and not necessarily centralized).
  2. Cache everything that doesn't change rapidly. Most of the time you don't have to hit the database for anything other than checking whether the users' new message count has transitioned from 0 to (1 or more).
  3. Cache everything--templates, user message status, the front page components--and hit the database once a minute or so to update the front page, forums, etc. This was sufficient to handle a site with a million hits a day on a couple of servers. The site was sold for $100K.
  4. Cache the users' subreddits. Blow out the cache on update.
  5. Cache the top links per subreddit. Blow out cache on update.
  6. Combine the previous two steps to generate a menu from cached blocks.
  7. Cache the last links. Blow out the cache on each outlink click.
  8. Cache the user's friends. Append 3 characters to their name.
  9. Cache the user's karma. Blow out on up/down vote.
  10. Filter via conditional formatting, CSS, and an ajax update.
  11. Decouple selection/ranking algorithm(s) from display.
  12. Use Google or something like Xapian or Lucene for search.
  13. Cache "for as long as memcached will stay up." That depends on how many instances you're running, what else is running, how stable the Python memcached hooks are, etc.
  14. The golden rule of website engineering is that you don't try to enforce partial ordering simultaneously with your updates.
  15. When running a search engine operate the crawler separately from the indexer.
  16. Ranking scores are used as necessary from the index, usually cached for popular queries.
  17. Re-rank popular subreddits or the front page once a minute. Tabulate votes and pump them through the ranker.
  18. Cache the top 100 per subreddit. Then cache numbers 100-200 when someone bothers to visit the 5th page of a subreddit, etc.
  19. For less-popular subreddits, you cache the results until an update comes in.
  20. With enough horsepower and common sense, almost any volume of data can be managed, just not in realtime.
  21. Never ever mix your reads and writes if you can help it.
  22. Merge all the normalized rankings and cache the output every minute or so. This avoids thousands of queries per second just for personalization.
  23. It's a lot cheaper to merge cached lists than build them from scratch. This delays the crushing read/write bottleneck at the database. But you have to write the code.
  24. Layering caches is a classic strategy for milking your servers as much as possible. First look for an exact match. If that's not found, look for the components and build an exact match. If one or more of the components aren't found, regenerate those from the DB (now it's cached!) and proceed. Never hit the database unless you have to (see the layered-cache sketch after this list).
  25. The majority of traffic on almost all websites comes from the default, un-logged-in front page or from random forum/comment/result pages. Make sure those are cached as much as possible.
  26. You (almost) always have to hit the database on writes. The key is to avoid hitting it for reads until you're forced to do so.
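
To make the layered-cache pattern from points 24-26 concrete, here is a minimal Python sketch. The in-process dict, the `render_component` helper, and the key names are all hypothetical; a real site would back this with memcached or Redis:

```python
# Layered caching: look for the fully assembled page first, then for its
# components, and only regenerate missing pieces from the database.
cache = {}

def render_component(name):
    """Stand-in for an expensive database-backed render."""
    return f"<div>{name}</div>"

def get_component(name):
    key = f"component:{name}"
    if key not in cache:
        cache[key] = render_component(name)   # regenerate; now it's cached
    return cache[key]

def get_page(page, component_names):
    key = f"page:{page}"
    hit = cache.get(key)
    if hit is not None:
        return hit                            # exact match: cheapest path
    body = "".join(get_component(n) for n in component_names)
    cache[key] = body                         # merging cached parts is far
    return body                               # cheaper than a full rebuild

def invalidate(page):
    cache.pop(f"page:{page}", None)           # blow out the cache on update

print(get_page("front", ["top_links", "menu"]))
```
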
Wednesday
Jul 29, 2009

Strategy: Let Google and Yahoo Host Your Ajax Library - For Free

Update: Offloading ALL JS Files To Google. Now you can let Google serve all your JavaScript files. This article tells you how to do it (upload to a Google Code project) and why it's a big win (cheap, fast, caching, parallel downloads, save bandwidth).

Don't have a CDN? Why not let Google and Yahoo be your CDN? At least for Ajax libraries. No charge. Google runs a content distribution network and loading architecture for the most popular open source JavaScript libraries, including jQuery, Prototype, script.aculo.us, MooTools, and Dojo. The idea is that web pages directly include your library of choice from Google's global, fast, and highly available network. Some have found much better performance and others experienced slower performance. My guess is performance may be slower for users close to your own data center, but far-away users will be much happier. Some negatives: not all libraries are included, and you'll load more than you need because all functionality is included. Yahoo has offered a similar service for YUI for a while. Remember to have a backup plan for serving your libraries, just in case.

Monday
Jul 27, 2009

Handle 700 Percent More Requests Using Squid and APC Cache

This post on www.ilovebonnie.net documents some impressive system performance improvements by the addition of Squid Cache (a caching proxy) and APC Cache (opcode cache for PHP).

  • Apache is able to deliver roughly 700% more requests per second with Squid when serving 1KB and 100KB images.
  • Server load is reduced using Squid because the server does not have to create a bunch of Apache processes to handle the requests.
  • APC Cache took a system that could barely handle 10-20 requests per second to handling 50-60 requests per second. A 400% increase.
  • APC allowed the load times to remain under 5 seconds even with 200 concurrent threads slamming on the server.
  • These two caches are easy to set up and install and allow you to get a lot more performance out of your existing servers.
The post has an in-depth discussion and a number of supporting charts. The primary point is how simple it can be to improve performance and scalability by adding caching.

Thursday
Jul 09, 2009

No to SQL? Anti-database movement gains steam – My Take

In this post I share my view on the anti-SQL database movement and where the alternative approaches fit in:

- SQL databases are not going away anytime soon.
- The current "one size fits all" database thinking was and is wrong.
- There is definitely a place for more specialized data management solutions alongside traditional SQL databases.

In addition to the options mentioned in the original article, I pointed out the in-memory alternative approach and how it fits into the puzzle. I used a real-life scenario, a scalable social-network-based eCommerce site, where I outlined how the in-memory approach was the only way they could scale and meet their application performance and response time requirements.

Thursday
Apr 16, 2009

Serving 250M quotes/day at CNBC.com with aiCache

As traffic to cnbc.com continued to grow, we found ourselves in an all-too-familiar situation where one feels a BIG change in how things are done is in order; the status quo was a road to nowhere. The spending on hardware, the space and power required to host additional servers, less-than-stellar response times, and having to resort to frequent "micro"-caching and similar tricks to improve code performance - all of these were surfacing in plain sight, hard to ignore. While the code base could clearly be improved, limited Dev resources and the need to innovate to stay competitive always limit the ability to go about refactoring. So how can one address performance and other needs without a full-blown effort across the entire team? For us, the answer was aiCache - a Web caching and application acceleration product (aicache.com).

The idea behind caching is simple - handle the requests before they ever hit your regular Apache<->JK<->Java<->Database response generation train (we're mostly a Java shop). Of course, it could be Apache-PHP-Database or some other backend, with byte-code and/or DB-result-set caching. In our case we have many more caching sub-systems, aimed at speeding up access to stock and company-related information. Developing for such micro-caching, and maintaining systems with micro-caching sprinkled throughout, is not an easy task. Nor is troubleshooting. But we digress...

aiCache takes this basic idea of caching and front-ending user traffic to your Web environment to a whole new level. I don't believe any of aiCache's features are revolutionary in nature; rather, it is the sheer number of features it offers that seems to address our every imaginable need. We've also discovered that aiCache provides virtually unlimited performance, combined with incredible configuration flexibility and support for real-time reporting and alerting. In the interest of space, here are some quick facts about our experience with the product, in no particular order:

  • Runs on any Linux distro; our standard happens to be RedHat 5, 64-bit, on HP DL360G5.
  • Responses are cached in RAM, not on disk. No disk IO, ever (well, outside of access and error logging, but even that is configurable). No latency for cached responses - stress tests show TTFB at 0 ms. Extremely low resource utilization - aiCache servers serving in excess of 2,000 req/sec are reported to be 99% idle! Being not a trusting type, I verified the vendor's claim and stress tested these to about 25,000 req/sec per server - with load averages of about 2 (!).
  • We cache both GET and POST results, with query and parameter busting (selectively removing the semi-random parameters that complicate caching).
  • For user comments, we use response-driven expiration to refresh comment threads when a new comment is posted.
  • We had a chance to use the site-fallback feature (where aiCache serves cached responses and shields origin servers from any traffic) to expedite service recovery.
  • We used origin-server tagging a few times to get us out of code-deployment-gone-bad situations.
  • We average about 80% caching ratios across about 10 different sub-domains, with some as high as a 97% cache-hit ratio. We have already downsized a number of production Web farms; having offloaded so much traffic from the origin server infrastructure, we see much lower resource utilization across Web, DB, and other backend systems.
  • Keynote reports significant improvement in response times - about 30%.
  • Everyone just loves the real-time traffic reporting; this is a standard window on many a desktop now. You get to see req/sec, response time, number of good/bad origin servers, client and origin server connections, input and output BW, and so on - all reported per cached sub-domain. Any of these can be alerted on.
  • We have wired up Nagios to read/chart some of aiCache's extensive statistics via SNMP; pretty much everything imaginable is available as an OID.
  • Their CLI interface is something I like a lot too: you can see the inventory of responses, write out any response, expire responses, and report responses sorted by request, size, fill time, refreshes and so on, in real time - no log crunching required. Some commands are cluster-aware, so you only execute them on one node and they are applied across the cluster.

Again, the list above is a small sample of the product features we use; there are many more that we use or are exploring. Their admin guide weighs in at 140 pages (!) - and it is all hard-core technical stuff that I happen to enjoy.

Some details about our network setup: we use F5 load balancers and have configured the virtual IPs to have both aiCache servers and origin servers enabled at the same time. Using F5's VIP priority feature, we direct all of the traffic to the aiCache servers as long as at least one is available, but retain the ability to fail all of the traffic over to the origin servers, automatically or on demand. We also use a well-known CDN to serve auxiliary content - JavaScript, CSS, and imagery.

I stumbled upon the product following a Wikipedia link, requested a trial download, and was up and running in no time. It probably helped that I have experience with other caching products, going back to circa 2000 with Novell ICS. But it all mostly boils down to knowing which URLs can be cached and for how long. And lastly: when you want to stress test aiCache, make sure to hit it directly, by the server's IP - otherwise you will most likely melt down one or more of your other network infrastructure components!

A bit about myself: an EE major, I have been working with Internet infrastructures since 1992 - from an ISP in Russia (uucp over an MNP-5 2400b modem seemed blazing fast back then!) to designing and running the infrastructures of some of the busier sites for CNBC and NBC - cnbc.com, NBC's Olympics website, and others.

Rashid Karimov, Platform, CNBC.com
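
The "parameter busting" trick mentioned above is easy to picture in code. Here is a small Python sketch of the idea - not aiCache's implementation - that normalizes a URL into a cache key by dropping semi-random query parameters; the parameter names in `BUSTED_PARAMS` are hypothetical examples:

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical examples of semi-random parameters that fragment the cache.
BUSTED_PARAMS = {"sessionid", "timestamp", "rnd"}

def cache_key(url):
    """Normalize a URL for use as a cache key by stripping busted params."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in BUSTED_PARAMS]
    kept.sort()                                  # canonical parameter order
    return f"{parts.path}?{urlencode(kept)}"

# Both requests map to the same cached response.
assert cache_key("/quote?sym=GE&rnd=123") == cache_key("/quote?sym=GE&rnd=999")
```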

Click to read more ...