Entries in General Discussion (161)

Sunday
Feb172008

Web Accelerators - snake oil or miracle remedy?

Perhaps this question is borderline off-topic but since high scalability solutions often have a global aspect I will give it a try... Have anybody had any experience with different techniques for speeding up their application to places that have a problem with poor ping response time? Ideally I would love to be running only one data center world-wide but one day I know that our sales department will sign up a customer with an unacceptable response time... Could installing a web-accelerator in front of our application extend the reach of our current data center or will we just add complexity and another source of potential errors?

Click to read more ...

Tuesday
Feb122008

Search the tags across all post

Let suppose i have table which stored tags .Now user can enter keywords and i have to search through all the records in table and find post which contain tags entered by user .user can enter more than 1 keywords. What strategy ,technique i use to search fast .There maybe more than millions records and many users are firing same query. Thanks

Click to read more ...

Tuesday
Feb122008

We want to cache a lot :) How do we go about it ?

We have a lot of dependencies to our SQL databases and we have heard that caching does help a lot as we move into scaling and providing better performance. So the question is what are some reliable software products out there that we could consider in this space ? We want to put a lot of frequently called database calls that do not change frequently into this caching layer. Also what would be an easy way to move only those database changes into the cache as opposed to reloading or pulling it into cache every few mins or hours. We need something smart that would just push changes to the caching layer as it happens. I guess we could build our own, but are there any good reliable products out there ? Please also mention how they play with regards to pricing 'cos that would be a determining factor as well. Thanks

Click to read more ...

Thursday
Feb072008

Looking for good business examples of compaines using Hadoop

I have read the blog about Mailtrust/Rackspace as well the interesting things with Google and Yahoo. Who else is using Hadoop/MapReduce to solve business problems. TIA johnmwillis.com

Click to read more ...

Tuesday
Feb052008

SLA monitoring

Hi, We're running a enterprise SaaS solution that currently holds about 700 customers with up to 50.000 users per customer (growing quickly). Our customers have SLA agreements with us that contains guaranteed uptimes, response times and other performance counters. With an increasing number of customers and traffic we find it difficult to provide our customer with actual SLA data. We could set up external probes that monitors certain parts of the application, but this is time consuming with 700 customers (we do it today for our biggest clients). We can also extract data from web logs but they are now approaching about 30-40 GB a day. What we really need is monitoring software that not only focuses on the internal performance counters but also lets us see the application from the customers viewpoint and allows us to aggregate data in different ways. Would the best approach be to develop a custom solution (for instance a distributed app that aggregates data from different logs every night and store them in a data warehouse) or are there products out there that are suitable for a high scalability environment? Any input is greatly appreciated!

Click to read more ...

Tuesday
Feb052008

Handling of Session for a site running from more than 1 data center

If using a DB to store session(used by some app server, ex.. websphere), how would an enterprise class site that is housed in 2 different data centers(that are live/live) maintain the session between both data centers. The problem as I see it is that since each data center has their own session database, if I was to flip the users to only access Data Center 1(by changing the DNS records for the site or some other Load balancing technique) then that would cause all previous Data Center 2 users to lose their session. What would be some pure hardware based solutions to this that are being used now? That way the applications supporting the web site can be abstracted from this. As I see now, a solution is to possibly have the session databases in both centers some how replicate the data to each other. I just don't see the best way to even accomplish this you are not even guraunteed that the session ID's will be unique since it's 2 different Application Server tiers(again websphere). Not to mention if the 2 data centers are some distance apart this could be difficult to accomplish as well.

Click to read more ...

Monday
Feb042008

IPS/IDS for heavy content site

All, My site would have heavy content (video/pictures). I'm looking for an efficient IPS/IDS solution which would not introduce much of latency. I'm more familiar with Cisco ASA and also familiar with Juniper, Foundry and others. I also came across snort but haven't used it before. I'm more of looking for an appliance (for the ease of configuration,support etc...) Could any one share their thoughts on performane of IPS/IDS from this vendors? Thanks! Janakan Rajendran

Click to read more ...

Monday
Feb042008

Streaming Video on Amazon EC2?

An Amazon EC2 Flash Video Streaming solution has been announced by Wowza Media. What do you think about the future of similar solutions? Is Amazon EC2 and S3 ready for video streaming? I have found threads on their forums related to the performance, scalability and high availability of the hosted streaming solution. How would you make it scalable? Is it really cheaper than traditional hosting? Looking forward to your thoughts!

Click to read more ...

Sunday
Feb032008

Ideas on how to scale a shared inventory database???

We have a database today that holds all of our shared inventory. How do we scale out ? We run into concurrency issues today as mutliple users may want to access the same inventory,etc. Im sure its a common problem.. So how do folks implement this while also having faster response to available inventory and also ensuring no downtime Thanks

Click to read more ...

Saturday
Feb022008

The case against ORM Frameworks in High Scalability Architectures

Let me begin by saying that I have used and continue to use various ORM frameworks such as hibernate, ibatis, propel and activerecord in applications and websites that have a user base ranging from a couple hundred to 500k users. Especially for projects that have to be up and running in a short duration of time, ORM frameworks significantly reduce the effort required to manipulate and persist OOP objects by providing time saving facilities such as automatically generated model objects, integrated unit testing, secure variable substitution, etc. Hibernate even supports horizontal data partitioning via Hibernate Shards. However, the lay of the land is significantly different in the rarefied space occupied by applications needing to support millions of users. Profiling an application at this level and paying particular attention to the operations needed to move data to and from the database, it becomes evident that a significant portion of the operations are API related, whereby the ORM framework is traversing the abstraction layer built between the application logic and the native methods that ultimately interact with the database. I see a couple of problems with this level of abstraction and for the purpose of this discussion, I will purposely ignore caching for the sake of keeping the scope succinct. 1. The process of optimizing database queries is as much an art as it is a science and I am yet to see an ORM framework that does this well. In the case of mysql, optimization involves using facilities such as explain, benchmark, analyze table, show index, and the slow queries log to identify non-performing queries and tweak them to extract the leanest performance. These optimizations necessarily work best when applied as close as possible to the bare metal, so to speak, and the abstraction of an ORM framework negates to an extent the benefits of optimization. The devil remains in the details and the further away you are from the details, the lesser a chance you have to find and square with the devil. 2. At the end of the day, an ORM framework is essentially middleware. My reading of some of the real life architectures presented on this sites seems to reinforce the assessment that middleware will only take you so far, beyond which you have to roll your own. This makes perfect sense. ORM frameworks are built to serve as wide an audience as possible and while their success is unquestionable in the commodity/middle market, they are not and cannot possibly be tooled to accommodate the atypical demands of high scalability architecture. That would be akin to running with hares and hunting with the hounds. Building a framework for hight scalability would also require that the builders have a front and center seat in an enterprise where they are exposed to the machinery and day to day operations of a high scalability site. A situation for which you would be hard pressed to find another installation bearing similar characteristics or with similar requirements. Additionally, and without putting down the developers who contribute to these frameworks, a majority of them would not have the exposure to a bona fide high scalability architecture to be able to bring their experience to bear on the framework code base. 3. Just as with kernel developers, I have a significant amount of faith in the folks that spend their every waking hour coding database engines such as MySQL, Postgres, Oracle, MS SQL etc. Consequently, when the main goal is ultimate performance and scalability, I generally frown upon efforts to introduce a middle man between the wicked fast database and the application logic. And having invested the time and effort over many years to learn the intricacies of a database engine, I am more apt to cast my lot with the devil that I know than abdicate control to a framework, however versatile. One could argue that it makes sense to start off with an ORM framework and as the demands for the site begin to eclipse what the framework can provide, gradually transition to a custom built solution. In my experience, refactoring on the database tier for a site that has a significant amount of data and needs to be operational 24x7 is pure hell. So much so that a more feasible option would be to build a parallel site then migrate and switch over. Of course this could be mitigated by using a service oriented architecture and thereby giving yourself some degree of maneuverability, but at the end of the day, there will be thousands of operations trying to read and write to the db every second. You are had, whichever which way you turn. Taking a look at the mediawiki source code that powers the Wikimedia sites including Wikipedia, there are two classes, DatabaseMySQL and DatabasePostgress which encapsulate the native PHP functions that talk to MySQL or PostgreSQL respectively. The other main classes such as the Article class then use these database classes to interact with the db. Simple and straight forward and in my opinion, the best way to get maximum performance and throughput.

Click to read more ...

Page 1 ... 7 8 9 10 11 ... 17 Next 10 Entries »