Entries by Nati Shalom (27)

Thursday
Apr 28, 2011

PaaS on OpenStack - Run Applications on Any Cloud, Any Time Using Any Thing

Yesterday, I had a session during the OpenStack Summit where I tried to present a more general view on how we should be thinking about PaaS in the context of OpenStack.

The key takeaways:

The main goal of PaaS is to drive productivity into the process by which we can deliver new applications.

Most existing PaaS solutions abstract the underlying infrastructure so aggressively that they fit only a fairly small number of extremely simple applications, and thus miss the real promise of PaaS.

Amazon's Elastic Beanstalk took a more bottom-up approach, giving us a better set of tradeoffs between abstraction and control, which makes it applicable to a larger set of applications.

The fact that OpenStack is open source allows us to think differently about the things we can do at the platform layer. We can create a tighter integration between the PaaS and IaaS layers and thus come up with a better set of tradeoffs in the way we drive productivity without giving up control. Specifically, that means that:

  • Anyone should be able to:
    • Build their own PaaS in a snap
    • Run on any cloud (public/private)
    • Gain multi-tenancy, elasticity… without code changes
  • The platform should provide a significantly higher degree of control, without adding substantial complexity, over our:
    • Language choice
    • Operating System
    • Middleware stack
  • It should come pre-integrated with a popular stack:
    • Spring, Tomcat, DevOps, NoSQL, Hadoop...
    • Designed to run the most demanding mission-critical apps

You can read the full story and see the demo here.

Monday
Apr 4, 2011

Scaling Social Ecommerce Architecture Case Study

A recent study showed that over 92 percent of executives from leading retailers are focusing their marketing efforts on Facebook and subsequent applications. Furthermore, over 71 percent of users have confirmed they are more likely to make a purchase after “liking” a brand they find online. (source)

Sears Architect Tomer Gabel provides an insightful overview of how they built a Social Ecommerce solution for Sears.com that can handle complex relationship queries in real time. Tomer goes through:

  • the architectural considerations behind their solution
  • why they chose memory over disk
  • how they partitioned the data to gain scalability
  • why they chose to execute code with the data using the GigaSpaces Map/Reduce execution framework
  • how they integrated with Facebook
  • why they chose GigaSpaces over Coherence and Terracotta for in-memory caching and scale
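The partitioning and "execute code with the data" points above can be illustrated with a small sketch. This is not the Sears implementation and not the GigaSpaces API: `PartitionedStore`, `partition_for`, `map_reduce`, and the sample "likes" data are all hypothetical names I made up for illustration. The idea is simply that each partition computes a partial result over its own in-memory data, and only the small partial results are combined.

```python
import zlib


class PartitionedStore:
    """Toy in-memory store (hypothetical) that spreads entries over N partitions."""

    def __init__(self, num_partitions):
        # Each dict stands in for the memory of one partition/node.
        self.partitions = [{} for _ in range(num_partitions)]

    def partition_for(self, routing_key):
        # Stable hash so the same key always lands on the same partition.
        return zlib.crc32(routing_key.encode("utf-8")) % len(self.partitions)

    def put(self, routing_key, value):
        partition = self.partitions[self.partition_for(routing_key)]
        partition.setdefault(routing_key, []).append(value)

    def map_reduce(self, map_fn, reduce_fn):
        # "Map": run the function next to the data, once per partition.
        partials = [map_fn(partition) for partition in self.partitions]
        # "Reduce": combine the small per-partition results.
        return reduce_fn(partials)


def count_likes(brand):
    """Build a map function that counts likes for one brand in a partition."""
    def map_fn(partition):
        return sum(1 for likes in partition.values() for b in likes if b == brand)
    return map_fn


store = PartitionedStore(num_partitions=4)
store.put("alice", "brand-a")
store.put("bob", "brand-a")
store.put("bob", "brand-b")
store.put("carol", "brand-b")

total_brand_a = store.map_reduce(count_likes("brand-a"), sum)
```

In a real deployment the partitions would live on separate machines, so shipping the function to the data avoids moving the data itself over the network.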

In this post I tried to summarize the main takeaways from the interview.

You can also watch the full interview (highly recommended).

Read the full story here

Wednesday
Mar 9, 2011

Productivity vs. Control tradeoffs in PaaS

Gartner recently published an interesting paper: Productivity vs. Control: Cloud Application Platforms Must Split to Win. (The paper requires registration.)

The paper does a pretty good job covering the evolution taking place in the PaaS market toward a more open platform, and compares the two main categories: aPaaS (essentially a PaaS running as a service) and CEAP (Cloud Enabled Application Platform), which is the *P* in PaaS that gives you the platform to build your own PaaS in a private or public cloud.

While reading the paper, I felt that something about this definition kept bothering me, even though I tend to agree with the overall observation. If I follow the logic of this paper, then I have to give up productivity to gain control. Hmm… that’s a hard choice.

The issue seems to be with the way we define productivity. Read the full details here

Friday
Jan 21, 2011

PaaS shouldn’t be built in Silos

Unlike many of the existing platforms, in this second-generation phase it's not going to be enough to package and bundle different individual middleware services and products (Web Containers, Messaging, Data, Monitoring, Automation and Control, Provisioning) and brand them under the same name to make them look as one. (Fusion? Fabric? A rose is a rose by any other name - and in this case, it's not a rose.)

The second-generation PaaS needs to take a holistic approach that couples all those things together and provides a complete, integrated experience. By that I mean that if I add a machine to a cluster, I need to see that as an increase in capacity across my entire application stack: the monitoring system needs to discover the new machine and start monitoring it without any configuration setup, the load balancer needs to add it to its pool, and so forth.

Our challenge as technologists will be to move out of our current siloed comfort zone. That applies not just to the way we design our application architecture but also to the way we build our development teams and the way we evaluate new technologies. Those who are going to be successful are those who design and measure how well all their technology pieces work together before anything else, and who look at a solution without reverence for past designs.
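To make the "add a machine and everything else follows" idea concrete, here is a minimal sketch. It does not describe any specific product; `Cluster`, `Monitor`, `LoadBalancer`, and the host name are hypothetical names used only for illustration. The point is that the monitoring system and the load balancer subscribe to a single cluster membership event instead of each being configured separately.

```python
class Cluster:
    """Hypothetical cluster that announces membership changes to subscribers."""

    def __init__(self):
        self.machines = []
        self._listeners = []

    def subscribe(self, listener):
        # A listener is any callable that accepts the new machine's host name.
        self._listeners.append(listener)

    def add_machine(self, host):
        self.machines.append(host)
        # One event drives every layer of the stack; no per-layer setup.
        for listener in self._listeners:
            listener(host)


class Monitor:
    """Stand-in for a monitoring system that auto-discovers machines."""

    def __init__(self):
        self.watched = set()

    def on_machine_added(self, host):
        self.watched.add(host)


class LoadBalancer:
    """Stand-in for a load balancer that grows its pool automatically."""

    def __init__(self):
        self.pool = []

    def on_machine_added(self, host):
        self.pool.append(host)


cluster = Cluster()
monitor = Monitor()
load_balancer = LoadBalancer()
cluster.subscribe(monitor.on_machine_added)
cluster.subscribe(load_balancer.on_machine_added)

cluster.add_machine("node-1")  # monitored and load-balanced with no extra steps
```

The design choice worth noting is that the cluster knows nothing about monitoring or load balancing; it only publishes the event, which is what keeps the pieces from turning into silos.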

full story ...

Tuesday
Nov 30, 2010

NoCAP – Part III – GigaSpaces clustering explained..

In many recent discussions on the design of large-scale systems (a.k.a. Web Scale), it was argued that the right set of tradeoffs for building such systems is to give up Consistency in favor of Availability and Partition tolerance. Those arguments rely on the CAP theorem, developed between 2000 and 2002. One of the core principles behind the CAP theorem is that you must choose two out of the three CAP properties. In many transactional systems, giving up consistency is either impossible or adds huge complexity to the design. In this series of posts, I've tried to suggest a different set of tradeoffs in which we can achieve scalability without compromising on consistency. I also argued that rather than choosing only two out of the three CAP properties, we could choose various degrees of all three. The degrees would be determined by the most likely availability and partition tolerance scenarios in our specific application. The suggested model is based on the experience we have had at GigaSpaces over the past years, and it has been successfully deployed in many mission-critical systems in finance, telco, and ecommerce. I hope that by sharing this experience we can come up with a broader set of patterns for building large-scale systems that also fit mission-critical transactional systems. Read more...

 

Tuesday
Nov 9, 2010

The Tera-Scale Effect 

In the past year, Intel issued a series of powerful chips under the new Nehalem microarchitecture, with large numbers of cores and extensive memory capacity. This new class of chips is part of a bigger Intel initiative referred to as Tera-Scale Computing. Cisco has released their Unified Computing System (UCS), equipped with a unique extended memory and a high-speed network within the box, which is specifically geared to take advantage of this type of CPU architecture.

This new class of hardware has the potential to revolutionize the IT landscape as we know it.

In this post, I want to focus primarily on the potential implications for application architecture, and more specifically for the application platform landscape. more...

Tuesday
Oct 26, 2010

Marrying memcached and NoSQL

Memcached is one of the most common in-memory cache implementations. It was originally developed by Danga Interactive for LiveJournal, but is now used by many other sites as a side cache to speed up read-mostly operations. It gained popularity in the non-Java world, too, especially since it’s a language-neutral side cache for which few alternatives existed.

As a side cache, memcached relies on the database as the system of record: the database is still used for write, update, and complex query operations. Since the memcached specification includes no query operations, memcached is not a database alternative, unlike most of the NoSQL offerings. This also excludes memcached from being a real solution for write scalability. As a result, many of the heavily loaded sites have started to move away from memcached and replace it with NoSQL alternatives, as noted in a recent highscalability post, MySQL And Memcached: End Of An Era?
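The side-cache role described above can be sketched in a few lines. This is a generic illustration and not the memcached client API: `SideCache`, its methods, and the sample keys are hypothetical, and plain dictionaries stand in for both memcached and the database. Reads try the cache first and fall back to the database; writes always go to the database, which is why this pattern on its own does nothing for write scalability.

```python
class SideCache:
    """Hypothetical cache-aside wrapper; dicts stand in for memcached and the DB."""

    def __init__(self, database):
        self.database = database  # system of record
        self.cache = {}           # stand-in for memcached
        self.db_reads = 0         # counts how often reads reach the database

    def read(self, key):
        # Fast path: serve read-mostly data straight from memory.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: load from the database and remember the result.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Every write still hits the database; the cache is only invalidated.
        self.database[key] = value
        self.cache.pop(key, None)


database = {"user:1": "alice"}
side_cache = SideCache(database)

first = side_cache.read("user:1")    # miss, goes to the database
second = side_cache.read("user:1")   # hit, served from the cache
side_cache.write("user:1", "bob")    # database write plus invalidation
third = side_cache.read("user:1")    # miss again after invalidation
```

Note that lookups are by key only, which mirrors the point that memcached has no query operations.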

The transition away from memcached to NoSQL could represent a large investment, as many sites are already heavily invested in memcached usage. In this post, I'll illustrate an alternative approach in which we’ll extend the use of memcached for write scaling, add other goodies such as high availability and elasticity by plugging in GigaSpaces as the backend datastore, and avoid the need for a rewrite. The pure Java implementation could also be seen as a benefit, as it can increase the adoption of memcached within the Java community and leverage the portability of Java to other platforms. more...

Monday
Oct 18, 2010

NoCAP

In this post I wanted to spend some time on the CAP theorem and clarify some of the confusion that I often see when people associate CAP with scalability without fully understanding the implications that come with it and the alternative approaches.

You can read the full article here

Wednesday
Sep 1, 2010

Scale-out vs Scale-up

In this post I'll cover the difference between multi-core concurrency, often referred to as Scale-Up, and distributed computing, often referred to as Scale-Out.

more..

Tuesday
Jul 27, 2010

YeSQL: An Overview of the Various Query Semantics in the Post Only-SQL World

The NoSQL movement faults the SQL query language as the source of many of the scalability issues that we face today with the traditional database approach.

I think that the main reason so many people have come to see SQL as the source of all evil is the fact that, traditionally, the query language was burned into the database implementation. So by saying NoSQL you basically say "No" to the traditional non-scalable RDBMS implementations.

This view has brought on a flood of alternative query languages, each aiming either to address something that is missing in the traditional SQL query approach, such as a document model, or to provide a simpler approach, such as Key/Value query.

Most of the people I speak with seem fairly confused on this subject, and tend to use query semantics and architecture interchangeably. In Part I of this post I tried to provide a quick overview of what each query term stands for in the context of the NoSQL world. Part II illustrates those ideas using code examples from GigaSpaces and DataNucleus/HBase.

See Part I and Part II for more information.

Click to read more ...