High Scalability

Entries in General Discussion (161)

Monday

Dec222008

SLAs in the SaaS space

Monday, December 22, 2008 at 10:11AM

This may be a bit higher level than the usual discussion here, but I think it is an important issue because of how it relates to reliability and uptime. What kind of SLAs should we expect from SaaS services and platforms (e.g. AWS, Google App Engine, Google Premium Apps, salesforce.com, etc.)? To date, most SaaS services either have no SLAs or offer very weak penalties. What will it take to get these services to the point where they can offer the SLAs that users (and, more importantly, businesses) require? I presume most members here want to see more movement into the cloud and to SaaS services, and I suspect that until we see more substantial SLA guarantees, most businesses will continue to shy away for as long as they can. Would love to hear what others think. Or am I totally off base?
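To make the SLA question concrete, it can help to turn an uptime percentage into the downtime it actually permits. The following is a minimal sketch of that arithmetic; the function name and the example targets (99%, 99.9%, 99.99%) are illustrative assumptions, not figures quoted from any provider's SLA.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Return the downtime budget, in minutes, implied by an uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Hypothetical targets, chosen only to show how quickly the budget shrinks.
    for target in (99.0, 99.9, 99.99):
        budget = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {budget:.1f} minutes of downtime per 30 days")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is one way to compare what a provider promises with what a business can tolerate.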

lennysan |

2 Comments |

General Discussion,

SAAS,

SLA,

transparency,

uptime

Friday

Dec192008

How to measure memory required for a user session

Friday, December 19, 2008 at 8:59AM

Hi, what practices and tools are used to measure the memory required per user session? Thanks, Unmesh
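One rough way to approach this question is to serialize a sample of sessions and look at the resulting sizes. The sketch below assumes sessions are plain, picklable Python dictionaries; the function names and sample data are hypothetical. Serialized size is only a proxy: the in-memory footprint inside a given application server will usually differ, so treat the numbers as a relative guide.

```python
import pickle
import statistics


def serialized_size_bytes(session: dict) -> int:
    """Approximate a session's footprint by the size of its pickled form."""
    return len(pickle.dumps(session, protocol=pickle.HIGHEST_PROTOCOL))


def summarize(sessions: list) -> dict:
    """Summarize serialized sizes across a sample of sessions."""
    sizes = [serialized_size_bytes(s) for s in sessions]
    return {
        "count": len(sizes),
        "mean_bytes": statistics.mean(sizes),
        "max_bytes": max(sizes),
    }


if __name__ == "__main__":
    # Hypothetical sample; in practice, capture real sessions from a test run.
    sample = [
        {"user_id": 1, "cart": []},
        {"user_id": 2, "cart": list(range(100)), "locale": "en_GB"},
    ]
    print(summarize(sample))
```

Multiplying the mean (or a high percentile) by the expected number of concurrent sessions gives a first estimate of total session memory.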

unmesh |

3 Comments |

General Discussion

Sunday

Nov302008

Creating a high-performing online database

Sunday, November 30, 2008 at 1:34PM

Hi there, I have an idea for an online database that serves a large number of people. I've been studying it for a while, and it seems feasible to create it and get people to populate it. It will need time to grow, but eventually it will get there. The model I'm looking at is IMDB: the depth of information is fascinating, yet it's fast. It's not so easy to use, but it's pretty usable! What do you think I need in order to create an online database like IMDB? I know that IMDB's power comes from its information, not the design of the site. This is something I kind of figured out. What I need to know is the best tools for publishing database contents on the web and retrieving them as fast as IMDB does. I'm sure I will need to create data entry logs for my users to populate the database. What programming languages do you suggest? Development environment? Approaches? Your contribution is highly appreciated. Regards, Jalil

Jalil |

7 Comments |

General Discussion,

database performance online website

Wednesday

Nov192008

High Definition Video Delivery on the Web?

Wednesday, November 19, 2008 at 2:45AM

How would you architect and implement an SD and HD internet video delivery system such as the BBC iPlayer or Recast Digital's RDV1? What do you need to consider on top of the Lessons Learned section in the YouTube Architecture post? How is it possible to compete with the big players like Google? Can you just use a CDN and scale efficiently? Would Amazon's cloud services be a viable platform for high-definition video streaming?

geekr |

9 Comments |

google,

video,

youtube

Tuesday

Nov112008

Architecture for content management

Tuesday, November 11, 2008 at 9:49AM

Hi, I am looking for a logical architecture for content management in a portal. Say an organization has many business processes and integrates with a few applications, and the application is portal based. What would an architecture framework for this type of functionality look like?

gvbsvv |

6 Comments |

portal

Thursday

Oct302008

The case for functional decomposition

Thursday, October 30, 2008 at 2:59AM

Hi all, I'm a big fan of http://highscalability.com/ and, in my current development, have been looking to decompose my application along functional boundaries as a route to scaling out the server side, specifically the database layer. The problem comes when there are links between the data in different components, i.e. one component holds all the user data, but another component needs to reference a user as the owner of some piece of data. I'm currently doing this by holding the primary key information for each side of the link (as you would if they all lived in a single database), but this link table needs to exist in both components to allow lookups in either direction, i.e. 'get the things a specific user owns' and 'get the owners of this specific thing' would each use a different component. The alternative would be to store the link data in only one of the components, but then the reverse lookups would require two calls instead of just one. My question is this: is the duplication of these link tables some kind of code smell I should be avoiding, or is this just the way things go when you split your app along functional lines like this? Is this sort of approach really applicable to anyone other than the eBays of this world? Should the rest of us just keep putting more functionality into the same back end? Cheers, Robin
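The duplicated link table described above can be sketched as two components that each store one direction of the ownership link. This is only an illustration under stated assumptions: the class names, method names, and the in-memory dictionaries standing in for each component's database are all hypothetical.

```python
from collections import defaultdict


class UserComponent:
    """Holds user data plus the user -> things direction of the link."""

    def __init__(self):
        self._things_by_user = defaultdict(set)

    def record_ownership(self, user_id, thing_id):
        self._things_by_user[user_id].add(thing_id)

    def things_owned_by(self, user_id):
        # One local lookup answers 'get the things a specific user owns'.
        return set(self._things_by_user.get(user_id, ()))


class ThingComponent:
    """Holds thing data plus the thing -> owners direction of the link."""

    def __init__(self):
        self._owners_by_thing = defaultdict(set)

    def record_owner(self, thing_id, user_id):
        self._owners_by_thing[thing_id].add(user_id)

    def owners_of(self, thing_id):
        # One local lookup answers 'get the owners of this specific thing'.
        return set(self._owners_by_thing.get(thing_id, ()))


def link(users, things, user_id, thing_id):
    """Write the link to both components.

    The two writes are not atomic here; a real system would need a retry
    or reconciliation step so the two copies cannot drift apart.
    """
    users.record_ownership(user_id, thing_id)
    things.record_owner(thing_id, user_id)
```

The trade-off is visible in `link`: each read needs only one call to one component, but every write touches both, and keeping the two copies consistent becomes the application's responsibility.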

robin |

4 Comments |

General Discussion,

functional decomposition

Tuesday

Sep162008

EE-Appserver Clustering OR Terracotta OR Coherence OR something else?

Tuesday, September 16, 2008 at 3:01AM

Hi, I am very glad that this site exists, as I have learned more about clustering here than from quite some time spent reading elsewhere. One can often find lots of material about clustering, but the practical real-life information is missing. Not so with this site.

I am currently planning the development of an application which has a lot of enterprise features and requirements. On the other hand (should the tiny chance of success strike us), this application would not be an in-house application of a financial institution or something like that, but some kind of community/Web 2.0 site. Thus it is an enterprise application with (hopefully, but surely unlikely) the user numbers of a social networking site. Each user-initiated transaction involves huge resources in terms of business logic (including insane amounts of encryption operations).

Of course, I do not intend to indulge in premature scaling, but to invest every minute I have in implementing business logic features. Nevertheless, I do not want to make some extremely bad choices which would force a complete reimplementation straight after the first tiny success, i.e. I want to start with the right technology and architecture, but wait with the implementation of the scalability and high-availability features.

Because of the enterprise aspects of this software, my first thought was to use only Java SE 6 and Java EE 5 technologies in order to get all the JEE features and be vendor independent at the same time. For implementation and testing purposes I thought of Glassfish v2UR2, Postgresql 8.3 and Solaris 10. As all of the major JEE-Appserver vendors advertise their clustering capabilities, I thought that this could not be a bad move. Hopefully, Glassfish would provide HA and scalability; if not, there would always be Geronimo, JBoss, Weblogic, or Websphere.

Now it seems that there are vast differences between different products:
- JEE application servers scale only to some degree(?). It seems that JEE is almost exclusively used for enterprise applications like SAP ERP or applications at financial institutions? Therefore, there is no need for extreme scalability.
- Terracotta seems to be very nice, as one does not have to learn the insanely huge JEE technology stack, but can just write a mostly Java-SE-only threaded application(?). But Terracotta does not seem to scale very well either (bottleneck with write operations caused by the master-worker architecture?), and we would be dependent on the future of the Terracotta Corporation. JEE, on the other hand, is vendor neutral.
- Oracle Coherence. This product seems to be the best distributed caching product and the holy grail of scalability(?). But it is Oracle-expensive. Absolutely nothing for a tiny start-up with no financing. JEE is vendor neutral and thus possibly much cheaper.

Do you think it is possible to produce a JEE architecture which could provide massive scalability (many hundreds of AppServers) using only the Glassfish clustering features? Or am I on a completely wrong track? Do we have to plan for Oracle Coherence usage? Are there other possibilities? Thanks a lot for any opinions or hints! Regards, mike

mike934 |

7 Comments |

Java,

enterprise

Wednesday

Sep102008

Shard servers -- go big or small?

Wednesday, September 10, 2008 at 1:48PM

Hello everyone, I'm designing a website/widget that my business partner and I expect to serve millions of hits daily. As such we must shard our database (and we're designing with shards in mind right from the beginning). However, the one thing I haven't been able to figure out from Googling is the best hardware to go with for shards. I'm using InnoDB tables exclusively.

We'll (eventually) be running 3 groups of database servers:
a) Session servers for PHP sessions. These will have a very high write volume.
b) ID servers. These will match a couple of primary indices (such as user ID) to a given shard. These will have an intense read load, plus a moderate amount of writes.
c) Shard servers. These will hold the bulk of the data. These will have a high read load and a lowish write load.

Group A is done as a database instead of using memcached so users aren't logged out if a memcached server goes down. As the write load is high, a pair of high-performance master-master servers seems obvious. What's the ideal hardware setup for machines with this role? Maxed RAM and fast disks seem reasonable. Should I bother with RAID > 0 if I have a live backup on the other master? I hear 4 cores is optimal for InnoDB -- recommendations?

Group B. Again, it looks like maxed RAM is recommended here. What about disks? Should I go for 10K or will regular SATA2 drives be okay? RAID 0, 5, 10? Cores? Should I think about adding slaves to a master-master setup?

Group C. It seems to me these machines can be of any capacity because the data they hold is easily spread between shards. What is the queries-per-second-per-dollar sweet spot when it comes to cores and number of disks? Should I beef these machines up, or stick with low-end hardware? Should I still max the RAM?

I have some other thoughts on system setup, too. As the data stored in the PHP sessions won't change frequently (it'll likely remain static for a user's entire visit -- all variable data can be stored in Group C shard servers), I'm thinking of using a memcached setup in front of the database and only pushing writes through to the database when necessary. Your thoughts?

We're also starting this on a minimal budget (of course), so where in the above is it best spent? Keep in mind that I can recycle machines used in Groups A & B in Group C as time goes on. Anyway, I'd love to hear from the expertise of the forum. I've been reading for a long time, and I'll be writing as our project evolves :) --Mark
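The idea of putting memcached in front of the session database and writing through only when necessary can be sketched as follows. This is a minimal illustration, not a memcached client: plain dictionaries stand in for both the cache and the Group A database, and the class and method names are hypothetical.

```python
import copy


class SessionStore:
    """Read-through cache in front of a durable store; writes go through only on change."""

    def __init__(self, cache=None, database=None):
        self.cache = {} if cache is None else cache            # stand-in for memcached
        self.database = {} if database is None else database   # stand-in for the session DB
        self.db_writes = 0                                     # counts durable writes

    def load(self, session_id):
        data = self.cache.get(session_id)
        if data is None:
            # Cache miss (or cache server lost): fall back to the database.
            data = self.database.get(session_id)
            if data is None:
                return None
            self.cache[session_id] = data
        # Return a copy so callers cannot mutate the stored value in place.
        return copy.deepcopy(data)

    def save(self, session_id, data):
        """Persist the session only if it differs from what is already stored."""
        if self.load(session_id) == data:
            return False  # unchanged: skip the database write
        snapshot = copy.deepcopy(data)
        self.database[session_id] = snapshot
        self.cache[session_id] = snapshot
        self.db_writes += 1
        return True
```

Because the database remains the source of truth, losing the cache costs one extra read per session rather than logging users out, while unchanged sessions generate no write load at all.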

Mark Rose |

2 Comments |

General Discussion,

MySQL,

Shard,

hardware,

innodb

Thursday

Sep042008

Database question for upcoming project

Thursday, September 4, 2008 at 3:06AM

We will be developing an RIA that will have a lot of database access. Think of something like QuickBooks, but with about 50 transactions entered per hour per user. Users will be in the system for 7 to 9 hours a day, and there will be around 20,000 users, all logged in at the same time. Reporting will be done just like in a QuickBooks-style app, plus a lot of extra things you don't do in QuickBooks. Our operations team is familiar with Windows Server 2003 and MS SQL Server, so they are recommending we stick with that. I originally requested Linux and PostgreSQL. How far can a single database server get me? If we have a 4-processor, 8-core, 128 GB server, how far am I going to get before I need to shard or do something else? I know there are a lot of factors involved, but in general, for a site of this size, what should the strategy be? I've read almost all the articles on this website, but most of the applications are either not RIA-type apps with this type of usage, or they are architectures for sites with millions of users, which we also won't have.
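A quick back-of-the-envelope calculation helps frame the single-server question. The sketch below uses the user count and entry rate from the post; the peak factor and the 8-hour working day (the midpoint of the 7 to 9 hours mentioned) are assumptions added for illustration, and reporting queries are not counted.

```python
SECONDS_PER_HOUR = 3600


def average_tps(users: int, tx_per_user_per_hour: float) -> float:
    """Average transactions per second if entries were spread evenly over each hour."""
    return users * tx_per_user_per_hour / SECONDS_PER_HOUR


def daily_transactions(users: int, tx_per_user_per_hour: float, hours_per_day: float) -> float:
    """Total transactions entered per working day."""
    return users * tx_per_user_per_hour * hours_per_day


if __name__ == "__main__":
    users, rate = 20_000, 50   # figures from the post
    peak_factor = 3            # assumption: bursts at three times the average
    hours_per_day = 8          # assumption: midpoint of 7 to 9 hours

    avg = average_tps(users, rate)
    print(f"average load: {avg:.0f} transactions/s")
    print(f"assumed peak: {avg * peak_factor:.0f} transactions/s")
    print(f"daily volume: {daily_transactions(users, rate, hours_per_day):,.0f} transactions")
```

With these inputs the average works out to roughly 278 entered transactions per second. Each entry may translate into several SQL statements, so the database-level rate is likely higher, but the figure gives a starting point for load testing a single server before deciding on sharding.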

mbinette |

2 Comments |

Monday

Sep012008

A Scalability checklist?

Monday, September 1, 2008 at 12:02AM

Hi everyone, I'm researching scalability for a college paper and found this site great, but it has so many tips, articles and the like that I can't see a hierarchical organization of subjects. I would need something like a checklist of things, fields, or technologies to take into account when assessing scalability. So far I've identified these:
- Hardware scalability:
  - scale out
  - scale up
- Cache. What types of cache are there? App-level, OS-level, network-level, I/O-level?
- Load Balancing
- DB Clustering

Am I missing something important? (I'm sure I am.) I don't expect you to give a lecture here, but maybe point some things out, give me some useful links... Thanks!

www.petruza.com.ar |

1 Comment |

General Discussion,

scalability checklist