ScaleAholic

Wednesday, October 6, 2010

A Couple Minutes With Ehcache BigMemory Pounder...

Introduction

UPDATE: This product was released (No longer in beta)

Ehcache with BigMemory is now out in beta. One of the challenges when confronted with a new technology is exercising it and understanding its characteristics. Sure, you can read the docs, Google for some blogs, integrate it with your application and maybe write some samples, but that's a lot of work. Plus, those approaches may not give you a clear picture of how the software behaves. To make things a bit easier, I've released a configurable pounder application for Standalone Ehcache with BigMemory.

Getting Installed

Here are the steps to get started:

Get the Ehcache with BigMemory Beta and a license key to use it.
Get the Standalone Ehcache Pounder distribution
Unpack the Ehcache with BigMemory distribution

tar -xzvf ehcache-core-ee-2.3-distribution.tar.gz

Copy the Standalone Ehcache Pounder kit into the ehcache kit and unpack it

cd ehcache-core-ee-2.3

cp ../ehcache-pounder-0.0.5-SNAPSHOT-distribution.tar.gz .

tar -xzvf ehcache-pounder-0.0.5-SNAPSHOT-distribution.tar.gz

Copy your license file and your ehcache core jar into the pounder kit

cd ehcache-pounder-0.0.5-SNAPSHOT

cp ../lib/ehcache-core-ee-2.3.jar .

and copy terracotta-license.key to the ehcache-pounder-0.0.5-SNAPSHOT as well.

Running the Pounder

Now you are ready to go. First take a look at the start script in one of the template directories (e.g. templates/1G-BigMemory):

sh run-pounder.sh

Out of the box it looks like this:

java -verbose:gc -Xms200m -Xmx200m -XX:+UseCompressedOops -XX:MaxDirectMemorySize=64G -cp ".:./jyaml-1.3.jar:./ehcache-pounder-0.0.5-SNAPSHOT.jar:./ehcache-core-ee-2.3.jar:slf4j-api-1.5.11.jar:slf4j-jdk14-1.5.11.jar" org.sharrissf.ehcache.tools.EhcachePounder

This should work fine for most people. It uses a small heap of 200 MB (you may need to grow this for really big caches). It is fine to have a MaxDirectMemorySize that is larger than the physical memory on your machine. Just don't configure an off-heap size in your config.yml (offHeapSize in the sample config) that is greater than the available physical memory on your machine. Also, make sure you leave room for your OS and the JVM. For example, if you have 8G of physical memory you might use a 400m Java heap and a 6G off-heap store, and leave the rest for the OS.

The script defaults to verbose GC because it's useful to see those stats. It also enables compressed oops BECAUSE YOU SHOULD BE USING A 64 BIT JVM, and this setting makes it much more efficient (object pointers stay closer to 32-bit size when possible, using much less heap).

I tend to run this script like this:

./run-pounder.sh | tee out.txt

That way all output of the test goes to both the screen and to the file out.txt

Configuring the Pounder

You configure the pounder using the config.yml file. You can learn about all the options in the README. Here is the sample config included in the kit:

storeType: OFFHEAP

threadCount: 33

entryCount: 1000000

offHeapSize: "1G"

maxOnHeapCount: 5000

batchCount: 50000

maxValueSize: 800

minValueSize: 200

hotSetPercentage: 99

rounds: 40

updatePercentage: 10

diskStorePath: /export1/dev

NOTE: The most important setting here is the offHeapSize. If you set this to a number greater than the amount of memory you have on your machine you will not be happy.

The offHeapSize + the heap size + the amount of memory your OS needs must together be less than the amount of available physical memory on the machine.
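The sizing rule above is simple arithmetic, so here is a minimal sketch of it in Java. The class and method names are hypothetical and are not part of the pounder. The 1G OS reserve is an assumed figure for illustration, so measure what your own machine needs.

```java
// Sketch only: MemoryBudget is a hypothetical helper, not part of the pounder.
final class MemoryBudget {

    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** True when heap + off-heap + OS reserve is strictly less than physical memory. */
    static boolean fits(long heapBytes, long offHeapBytes, long osReserveBytes, long physicalBytes) {
        return heapBytes + offHeapBytes + osReserveBytes < physicalBytes;
    }

    /** Bytes left over after the heap, the off-heap store and the OS reserve. */
    static long headroom(long heapBytes, long offHeapBytes, long osReserveBytes, long physicalBytes) {
        return physicalBytes - (heapBytes + offHeapBytes + osReserveBytes);
    }

    public static void main(String[] args) {
        long physical = 8 * GB;    // the 8G machine from the example above
        long heap = 400 * MB;      // -Xms400m -Xmx400m
        long offHeap = 6 * GB;     // offHeapSize: "6G"
        long osReserve = 1 * GB;   // ASSUMPTION: what the OS and the rest of the JVM need
        System.out.println("fits: " + fits(heap, offHeap, osReserve, physical));
        System.out.println("headroom MB: " + headroom(heap, offHeap, osReserve, physical) / MB);
    }
}
```

With those numbers the budget fits and leaves 624 MB of headroom. Raising the off-heap size to 8G on the same machine makes the check fail.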

Output

When running, the output will look like the following:

size:492120 time: 7434 Max batch time millis: warmup value size:796 READ: 0 WRITE: 15151 Hotset: 99

size = total size of cache

time = total time to execute batch

max batch time = either warm up for the load phase or the longest time it took to execute a batch (note that in a multi-threaded test on a cpu bound machine the batch times are going to be impacted.)

value size = size of the value in this batch

READ = number of reads in the batch

WRITE = number of writes in the batch

hotset = percentage of the time that reads are done from the on heap cache

At the end of each round you'll see something like this:

Took: 10899 final size was 995240 TPS: 91751 MAX GET TIME: 20

Took = total time the round took

final size = the total size of the cache at the end (this can be impacted by eviction and be less than what you loaded)

TPS = total TPS during the run

MAX GET TIME = The maximum amount of time it took to get an entry
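If you tee the output to a file, you may want to pull the round summaries back out of it. The following is a minimal sketch of a parser for the end-of-round line shown above. RoundSummary is a hypothetical helper, not part of the pounder, and it assumes "Took" is in milliseconds. Under that assumption, 1,000,000 operations (the entryCount in the sample config) over 10899 ms works out to the 91751 TPS printed in the sample line, but treat that relationship as an inference rather than a documented formula.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: RoundSummary is a hypothetical helper, not part of the pounder.
final class RoundSummary {

    // Matches a line such as:
    // Took: 10899 final size was 995240 TPS: 91751 MAX GET TIME: 20
    private static final Pattern LINE = Pattern.compile(
        "Took:\\s*(\\d+)\\s+final size was\\s+(\\d+)\\s+TPS:\\s*(\\d+)\\s+MAX GET TIME:\\s*(\\d+)");

    final long tookMillis;   // ASSUMPTION: the unit is milliseconds
    final long finalSize;
    final long tps;
    final long maxGetTime;   // unit not stated in the output, so left unnamed

    private RoundSummary(long tookMillis, long finalSize, long tps, long maxGetTime) {
        this.tookMillis = tookMillis;
        this.finalSize = finalSize;
        this.tps = tps;
        this.maxGetTime = maxGetTime;
    }

    /** Returns the parsed summary, or null if the line is not a round summary. */
    static RoundSummary parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new RoundSummary(
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)));
    }

    /** Whole operations per second, given an operation count and a duration in milliseconds. */
    static long opsPerSecond(long operations, long tookMillis) {
        return operations * 1000L / tookMillis;
    }

    public static void main(String[] args) {
        RoundSummary s = parse("Took: 10899 final size was 995240 TPS: 91751 MAX GET TIME: 20");
        System.out.println("reported TPS: " + s.tps);
        System.out.println("recomputed:   " + opsPerSecond(1_000_000L, s.tookMillis));
    }
}
```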

An interesting design choice of this pounder is that the threads you define do BOTH reading and writing, so the writers can starve out the readers in a hotset test. I could have gone the other way, and then the readers could have starved out the writers, giving overly generous TPS.
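To make that design choice concrete, here is a minimal sketch of a worker in which every thread both reads and writes, choosing per operation according to an update percentage. This is not the pounder's actual code. MixedWorker, runBatch and its parameters are hypothetical, a ConcurrentHashMap stands in for the cache, and the hot set is not modelled at all.

```java
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: MixedWorker is hypothetical and is not the pounder's implementation.
final class MixedWorker {

    /**
     * Runs `ops` operations against the cache. Each operation is a write with
     * probability updatePercentage/100 and a read otherwise.
     * Returns a two-element array: {reads, writes}.
     */
    static int[] runBatch(Map<Integer, byte[]> cache, int ops, int keySpace,
                          int updatePercentage, long seed) {
        Random random = new Random(seed);
        int reads = 0;
        int writes = 0;
        for (int i = 0; i < ops; i++) {
            int key = random.nextInt(keySpace);
            if (random.nextInt(100) < updatePercentage) {
                cache.put(key, new byte[200]);   // same thread writes...
                writes++;
            } else {
                cache.get(key);                  // ...and reads
                reads++;
            }
        }
        return new int[] {reads, writes};
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Integer, byte[]> cache = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final long seed = t;
            threads[t] = new Thread(() -> {
                int[] counts = runBatch(cache, 50_000, 1_000, 10, seed);
                System.out.println("READ: " + counts[0] + " WRITE: " + counts[1]);
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
    }
}
```

Because reads and writes share the same threads, a slow write delays the reads queued behind it on that thread. That is the starvation effect described above.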

What's Next

The biggest gap right now is that I didn't implement a way to specify a rate. It always goes full throttle. I'll try to add that when I get some time if people think it's useful.

Please give me lots of feedback so we can improve both the pounder and Ehcache

Learn more at:

The source for the pounder

http://terracotta.org/bigmemory

My other blog on BigMemory
Check the Ehcache BigMemory docs

Server Array BigMemory FAQ

Wednesday, September 15, 2010

A Little Bit About BigMemory for Ehcache and Terracotta ...

Big Memory?

In talking to our users it is clear that applications are getting more and more data hungry. According to IDC, data requirements are growing at an annual rate of 60 percent. This trend is driven further by cloud computing platforms, company consolidation and huge application platforms like Facebook. There is good news though. Server class machines purchased this year have a minimum of 8 Gig of RAM and likely have 32 Gig. Cisco is now selling mainstream UCS boxes with over 380 Gig of RAM (which I have tried and is amazing). On EC2 you can borrow 68.4 Gig machines for 2 dollars an hour (I have also tried this and it is also pretty amazing). Memory has gotten big and extremely cheap compared to things like developer time and user satisfaction.

Unfortunately a problem exists as well. For Java/JVM applications it is becoming an ever-increasing challenge to use all that data and memory. At the same time that the data/memory explosion is occurring, the amount of heap a Java process can effectively use has stayed largely unchanged. This is due to the ever-increasing garbage collection pauses that occur as a Java heap gets large. We see this issue at our customers, but we also see it here at Terracotta when tuning our products and the products we use, like third-party app servers, bug tracking systems, CMSs and the like. How many times have you heard "run lots of JVMs" or "don't grow the heap" from your vendors and/or devs?

So we set out to first identify the problem as it exists today, both in the wild and in-house. We then created a solution, first for us (an internal customer) and then for all of the millions of nodes of Ehcache out there (all of you)

3 Big Problems Seen by Java Applications

My Application is too slow

My application can't keep up with my users. I've got tens of gigs of data in my database but it's overloaded and/or too slow to service my needs, either due to the complicated nature of my queries or the volume of those queries. I want my data closer to the application, so of course I start caching. Caching helps, but I want to cache more. My machine has 16 gigs of RAM but if I grow my heap that large, I get too many Java GC pauses.

My Application's latencies aren't predictable

On average my Java application is plenty fast but I see pauses that are unacceptable to my users. I can't meet my SLA's due to the size of my heap combined with Java GC pauses.

My software/deployment is too complicated

I've solved the Java GC problem. I run many JVMs with heap sizes of 1-2 gigs. I partition my data and/or load balance to get the performance and availability I need, but my setup is complicated to manage because I need so many JVMs and I need to make sure the right data is in the right places. I fill up all 64 gigs of RAM on my machine, but it's too hard and fragile.

The other problem

Like many vendors, in the past we told our users to keep their heaps under 6 gigs. This forced our customers either to not fully leverage the memory and/or CPU on the machines they purchased, or to stack JVMs on a machine. The former is expensive and inefficient and the latter is fragile and complex.

Here is a quick picture of what people do with their Java Applications today:

Base Case - Small heap JVM on a big machine because GC pauses are a problem
Big heap - That has long GC's that are complicated to manage
Stacked small JVM heaps - This in combination with various sharding, load balancing and clustering techniques is often used. This is complicated to manage and if all the nodes GC at the same time this can lead to availability problems.

What kind of solution would help?

Here's what we believe are the requirements for a stand-alone caching solution that attacks the above problems.

1) Hold a large dataset in memory without impacting GC (10s-100s of Gig) - The more data that is cached, the less you have to go to your external data source and/or disk, and the faster the app goes
2) Be Fast - needs to meet the SLA
3) Stay Fast - Don't fragment, don't slow down as the data is changed over time
4) Concurrent - Scales with CPU and threads. No lock contention
5) Predictable - can't have pauses if I want to make my SLA
6) Needs to be 100 percent Java, work on your JVM on your OS
7) Restartable - A big cache like this needs to be restartable because it takes too long to build
8) Should just Snap-in and work - not a lot of complexity

What have we built?

First we built a core piece of technology, BigMemory, an off-heap, direct memory buffer store with a highly optimized memory manager that meets or exceeds requirements 1-6 above. This piece of technology is currently being applied in two ways:

1) Terracotta Server Array - We sold it to our built-in customer, the Terracotta Server Team, who can now create individual nodes of our L2 caches that can hold a hundred million entries, leverage 10's of gigs of memory, pause free and with linear TPS. This leverages entire machines (even big ones) with a single JVM for higher availability, a simpler deployment model, 8x improved density and rock steady latencies.

2) Ehcache - We've added BigMemory and a new disk store to Enterprise Ehcache to create a new tiered store adding in requirements 7-8 from above (snap-in simplicity and restart-ability). The Ehcache world at large can benefit from this store just as much as the Terracotta products do.

Check out the diagram below.

Typically, using either of the BigMemory backed products, you shrink your heap and grow your cache. By doing so SLA's are easier to meet because GC pauses pretty much go away and you are able to keep a huge chunk of data in memory.
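To give a feel for the general idea of a small on-heap tier in front of a larger off-heap store, here is a toy sketch using only the JDK. It is emphatically not how BigMemory is implemented. TinyTieredStore and everything in it are hypothetical, the off-heap region is append-only and never reclaims space, and the key index still lives on the heap for simplicity. The one real mechanism it shows is that bytes placed in a direct ByteBuffer sit outside the Java heap, so the garbage collector does not have to walk them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy illustration only: this is NOT BigMemory's design or API.
final class TinyTieredStore {

    private final ByteBuffer offHeap;                                  // memory outside the Java heap
    private final Map<String, int[]> offHeapIndex = new HashMap<>();   // key -> {offset, length}
    private final LinkedHashMap<String, String> onHeap;                // small LRU "hot" tier

    TinyTieredStore(final int maxOnHeapCount, int offHeapBytes) {
        this.offHeap = ByteBuffer.allocateDirect(offHeapBytes);
        // An access-ordered LinkedHashMap gives a simple LRU for the on-heap tier.
        this.onHeap = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxOnHeapCount;
            }
        };
    }

    synchronized void put(String key, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > offHeap.remaining()) {
            throw new IllegalStateException("off-heap store is full");
        }
        int offset = offHeap.position();
        offHeap.put(bytes);   // append-only: older versions of a key are never reclaimed
        offHeapIndex.put(key, new int[] {offset, bytes.length});
        onHeap.put(key, value);
    }

    synchronized String get(String key) {
        String hot = onHeap.get(key);
        if (hot != null) {
            return hot;       // served from the on-heap tier
        }
        int[] slot = offHeapIndex.get(key);
        if (slot == null) {
            return null;
        }
        byte[] bytes = new byte[slot[1]];
        ByteBuffer view = offHeap.duplicate();   // same memory, independent position
        view.position(slot[0]);
        view.get(bytes);
        String value = new String(bytes, StandardCharsets.UTF_8);
        onHeap.put(key, value);                  // promote back into the hot tier
        return value;
    }

    synchronized int onHeapCount() {
        return onHeap.size();
    }

    synchronized int size() {
        return offHeapIndex.size();
    }

    public static void main(String[] args) {
        TinyTieredStore store = new TinyTieredStore(2, 1024);
        store.put("a", "alpha");
        store.put("b", "beta");
        store.put("c", "gamma");
        System.out.println("entries: " + store.size() + ", on heap: " + store.onHeapCount());
        System.out.println("a = " + store.get("a"));
    }
}
```

In the usage above, the third put pushes "a" out of the two-entry on-heap tier, so the later get for "a" is served from the direct buffer and promoted again. A real store also needs eviction, space reclamation and far better concurrency than one lock, which is where the hard work is.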

Summing up

Memory is cheap and growing. Data is important and growing just as fast. Java's GC pauses are preventing applications from keeping up with your hardware. So do what every other layer of your software and hardware stack does: cache. But in Java, the large heaps needed to hold your cache can hurt performance due to GC pauses. So use a tiered cache with BigMemory that leverages your whole machine and keeps your data as close to where it is needed as possible. That's what Terracotta is doing for its products. Do so simply, i.e. snap it into Ehcache and have large caches without the pauses caused by GC. As a result, create a simpler architecture with improved performance/density and better SLAs.

Learn more at http://terracotta.org/bigmemory
Check the Ehcache BigMemory docs

Server Array BigMemory FAQ

Wednesday, July 28, 2010

Application Scale and Quartz "Where"

I don't think it is too controversial to say that Quartz Scheduler is by far the dominant open-source Java job scheduler. It provides fast, flexible, and extremely reliable job execution and it is embedded in just about everything out there. In the last 6 months or so we have added a new Terracotta-backed version of Quartz, lots of bug fixes and a GUI for monitoring Quartz when clustered with Terracotta. We have a lot more coming. The team is hard at work on what's next. My favorite feature on the Quartz 2.0 list, the one they are working on right now, is Job Location Control, code named "Where".

Some of the trends we are seeing in our user base include the leveraging of EC2, larger apps/user bases, demanding HA requirements and hardware farms. As a result, scaled-out architectures are becoming commonplace in many IT environments. Scheduling work in these large multi-node environments is becoming a part of many software developers' lives. Currently, Quartz supports scale-out but with little control over how the work is distributed. While this is a good start, one quickly runs into problems like assigning jobs to machines that have the processing power to do the work at the time the job is fired, executing the job where the data is local, or just performing certain jobs on certain classes of machines due to their location or purpose.

Well that is "where" we are headed. In the next few months the Quartz guys working with the Terracotta Guys and the Ehcache guys are developing a solution to the above set of problems. We'll be giving the same flexible and reliable scaled out scheduler but adding a new level of control. We are adding the "Where"

Stay tuned...

Thursday, July 22, 2010

Hiring at Terracotta - Performance/Testing Engineer and or Lead

Working at Terracotta is just about the best job someone could want (my humble opinion). Fast paced, super smart people, lots of interesting problems and widely adopted products wrapped up in a nice little package known as a fast growing startup. So send your resume now!

These positions can be either in San Francisco or Noida India.

About Terracotta:

Terracotta is the fast-growing company behind the most widely used software for application scalability, availability and performance. Our software is deployed in more than 250,000 enterprise installations, including the majority of the Fortune 2000.

Snap-in Scale and Performance

Terracotta's software products provide snap-in performance and scale for enterprise applications. With a simple change in two lines of configuration, Terracotta customers can run enterprise applications 10x faster and scale them―from one node, to 1000s, even to the cloud―without re-writing code or compromising performance or reliability.

A Leader in Distributed Caching

In “The Forrester Wave™: Elastic Caching Platforms, Q2 2010,” Forrester named Terracotta a Leader in this emerging market and ranked us strongest in strategy among eight elastic caching platforms.

http://www.terracotta.org

Where we are:

Our main headquarters are in San Francisco, CA in the SOMA area. We have an office in Noida, India and we have super star developers all over the world.

Description for LEAD QA Engineer

At Terracotta quality and stability in our product are our primary objectives. Join our highly motivated, fast paced, agile, quality driven development team where you will have many great opportunities to make an impact on product capabilities and success in your role as QA Lead.

As a valued member of our tech lead team you will:

* Develop, maintain, and enhance both unit testing and performance testing frameworks

* Design test strategies, develop test tools and implement test cases to ensure highest quality deliverables for maintenance and new feature releases

* Improve the overall productivity of all of your co-workers by identifying tools and processes to increase overall efficiency

* Create and maintain functional, performance, stress and endurance tests

* Diagnose and debug issues in a production environment

* Work closely with Engineering to understand the Product Architecture and work on identifying, designing or enhancing existing test frameworks to support backend test development

* Mentor and manage QA Engineers in a distributed team

Qualifications

* Proven track record as a lead in development and/or QA

* Motivated to improve existing processes, test strategies

* Strong knowledge of Java or other related programming languages

* Strong Knowledge in at least one scripting language such as Perl or Ruby

* Experience in creating back end test frameworks

* Ability to work independently to triage issues and prioritize tasks

* Strong understanding of QA Process

* Strong communication skills (verbal and written)

* Experience with code coverage and test tool development

* Experience with UNIX

* Experience with distributed caches, high availability products, and/or NoSQL solutions

* Experience with common java frameworks and containers such as Spring, Jetty, Hibernate, Ehcache, Quartz

* Ability to focus on multiple projects while in differing SDLC phases

If that's you and you meet most of the above criteria, send your resume to careers@terracottatech.com.

Tuesday, July 20, 2010

Hiring at Terracotta...

About Terracotta:

Snap-in Scale and Performance

A Leader in Distributed Caching

http://www.terracotta.org

Where we are:

Our main headquarters are in San Francisco, CA in the SOMA area. We have super star developers all over the world.

Who we are looking for:

Have you written or worked on a distributed system, messaging system or NoSQL solution? Do you love cool, hard problems? Are you good at designing APIs that work well in today's Java applications? Can you work both in a group and on your own?

Qualifications:

Works hard and solves hard problems
Strong OO/Framework design sense
Understand the Java Landscape in a deep way (i.e. Spring, J2EE, Ehcache, Quartz, Rest, SOAP, NoSQL)
Excited about performance, caching and scale-out
Love to code and write tests
Works well in a team and individually
Believes that the only way to know if something works is to test it in a repeatable way

Nice to have:

Experience with open-source
Live in/near San Francisco
Tech Lead experience

Responsibilities:

Design and build the next generation of scale-out, performance and HA software.
Contribute to and extend Terracotta, Ehcache and/or Quartz Scheduler, some of the most popular and widely used frameworks in Java

If that's you and you meet most of the above criteria, send your resume to careers@terracottatech.com.

Thursday, June 24, 2010

A Couple Minutes With Some Toolkit Samples

Last night we released Beta 2 of the Terracotta 3.3 release. It has a number of improvements and updates to the Terracotta Toolkit, including improved naming and factoring and additional clustered classes like AtomicLong and List. You can grab that or, to get the absolute latest, you can grab a nightly (my preference, because the beta doesn't work with Maven).

I pushed a couple of toolkit samples to GitHub (which by the way is awesome!). I did hardly any cleanup or commenting, so just post questions to the blog if you need help.

http://github.com/sharrissf/Terracotta-Tools-and-Samples/

If you're a Git user already I don't need to tell you how to check it out. If you're not, just hit "download source" on the upper right of the web page and don't worry about Git.

Five examples are included:

PlayingWithMapOfLocksExpress - A quick sample that shows how to create a clustered Map of Locks.

PlayingWithToolkitBarrier - Example of using a Cyclic barrier to coordinate between processes

PlayingWithToolkitClusterInfo - Example showing how to register a listener for cluster events.

PlayingWithToolkitQueue - Little sample on using a queue between two nodes

PlayingWithToolkitClusterCounter - Basic sample that shows using a clustered atomic long
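If you have not used these primitives before, here is a minimal single-JVM analogue of the barrier and counter samples. It uses only java.util.concurrent and does not touch the Terracotta Toolkit API. The samples in the repository coordinate separate processes through the cluster, which this sketch does not attempt. BarrierAndCounter is a hypothetical name.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;

// Single-JVM analogue only: plain java.util.concurrent, not the Terracotta Toolkit API.
final class BarrierAndCounter {

    /**
     * Starts `parties` threads. Each waits at the barrier until all have arrived,
     * then increments a shared counter `perThread` times. Returns the final count.
     */
    static long run(int parties, final int perThread) {
        final CyclicBarrier barrier = new CyclicBarrier(parties);
        final AtomicLong counter = new AtomicLong();
        Thread[] threads = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            threads[i] = new Thread(() -> {
                try {
                    barrier.await();   // nobody proceeds until every party is here
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
                for (int n = 0; n < perThread; n++) {
                    counter.incrementAndGet();   // atomic, so no updates are lost
                }
            });
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("final count: " + run(4, 1000));
    }
}
```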

You can run these from the command line by:

Downloading and unpacking the nightly build or beta from Terracotta
Starting the Terracotta server by calling ./bin/start-tc-server.sh
Running 2 instances of the compiled versions of any of the above

java -cp target:INSTALL_DIR/common/terracotta-toolkit-1.0-runtime-ee-1.0.0-SNAPSHOT.jar PlayingWithToolkitClusterInfo

You can also run them using the Terracotta Maven Plugin:

mvn tc:run

To switch between the samples edit the pom.xml

Wednesday, June 9, 2010

A Couple Minutes With The Terracotta 3.3 Beta

With Terracotta platform version 3.3 our goal of creating an accessible application performance and scale-out solution is taking another big step forward. It's important to us that developers can find success using the ubiquitous Ehcache for performance, Hibernate for an ORM, Http Web Sessions for user state and Quartz for scheduling in a single node application. Then, without much thought or effort, add a couple lines of config to achieve scale-out and HA serving needs all the way through enterprise apps and into the cloud.

We have focused on a couple of high level areas in this release to take our solution to the next level.

Simple Scale - Reduce the need for tuning and tweaking with an improved next gen datastore. It will allow the everyday user to achieve the kinds of scale needed for massive applications both in data size and number of nodes.

Improved Visibility - We have added panels for Quartz and Sessions to our developer console giving full visibility to the full suite of performance and scale-out products. We have also added a more product focused organization of information in the tool.

Simple HA - A new panel that makes it easier to monitor interesting events that occur in a cluster. Pre-built templates for various configurations. Simplified way of migrating nodes, better defaults.

Modularity - We have exposed some of our most powerful pieces and parts as a versioned standard API that can be simply coded against to get things like, locking, queuing, maps and cluster topology. We use this API to build all four of our core products (Ehcache, Quartz, Hibernate 2nd level cache, Web Sessions).

While not everything made it into the beta, enough is there to get a taste of where we are going. I encourage people to download it and try it as soon as possible. GA is just around the corner. Learn more on the release notes page.