Scaling software should be something that Hardware, Cloud, and System Administrators can do with ease, without knowledge of the applications being scaled.
Wednesday, December 15, 2010
Ehcache To The Rescue (Comic Strip)
Monday, November 29, 2010
Quartz Scheduler 2.0 Beta 1 Welcomes New Fluent API and "Where"
- Simplify/modernize the Quartz API.
- Improve the Quartz experience when leveraging a cluster
- The date/time-related methods have been moved off the Trigger and Job classes and into a date-building class called "DateBuilder"
- We've removed the need to know details about which Job and Trigger classes you need and instead infer them through the building methods you call.
- The construction now reads more like a sentence: newJob(...).withIdentity("job1", "group1"); newTrigger().withIdentity("trigger1", "group1").startAt(runTime). See the sketch below.
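To make that concrete, here is a minimal sketch against the 2.0 builder API. HelloJob and the 5-minute start time are made up for illustration; check the Quartz 2.0 docs for the full set of builders.

import static org.quartz.DateBuilder.futureDate;
import static org.quartz.JobBuilder.newJob;
import static org.quartz.TriggerBuilder.newTrigger;

import org.quartz.DateBuilder.IntervalUnit;
import org.quartz.Job;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.impl.StdSchedulerFactory;

public class FluentApiSample {

    // Hypothetical job used only for this sketch
    public static class HelloJob implements Job {
        public void execute(JobExecutionContext context) {
            System.out.println("Hello from " + context.getJobDetail().getKey());
        }
    }

    public static void main(String[] args) throws Exception {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        // The builder infers the JobDetail implementation for you
        JobDetail job = newJob(HelloJob.class)
                .withIdentity("job1", "group1")
                .build();

        // Date/time helpers now live on DateBuilder rather than on Trigger
        Trigger trigger = newTrigger()
                .withIdentity("trigger1", "group1")
                .startAt(futureDate(5, IntervalUnit.MINUTES))
                .build();

        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}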
Ehcache 2.4 Beta 1 Welcomes Search, Local Transactions and more...
- A bit of annoying coding
- Only practical for unclustered caches
- Transactions without a JTA transaction manager
- More speed
- NonStopCache is now built in. Rather than having to add a jar and configure a wrapper to get non-stop characteristics in clustered land, this is now built into the product core and can be turned on via configuration
- Search now works clustered - The new search API is backed by the Terracotta tier. This is still early and we have a lot of performance and HA work to do here. That said, it is testable and usable, so give it a try (see the sketch after this list).
- Explicit locking module is now in the core kit
- Rejoin now works in non-stop (You can disconnect from a cluster and reconnect to that cluster without restarting)
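For a feel of the search API, here is a minimal sketch. The "people" cache name, the Person value class and the "age" attribute are assumptions for illustration; the cache would need to be marked searchable with a matching searchAttribute in ehcache.xml.

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;
import net.sf.ehcache.search.Attribute;
import net.sf.ehcache.search.Query;
import net.sf.ehcache.search.Result;
import net.sf.ehcache.search.Results;

public class MyFirstSearchSample {
    public static void main(String[] args) {
        // Assumes ehcache.xml defines a "people" cache marked <searchable>
        // with a <searchAttribute name="age"/> extracted from the cached values
        CacheManager manager = CacheManager.create();
        Cache cache = manager.getCache("people");

        cache.put(new Element(1, new Person("Ari", 35)));
        cache.put(new Element(2, new Person("Pascal", 29)));

        // Build and run a query: keys of everyone older than 30
        Attribute<Integer> age = cache.getSearchAttribute("age");
        Query query = cache.createQuery().includeKeys().addCriteria(age.gt(30));
        Results results = query.execute();
        for (Result result : results.all()) {
            System.out.println("Matched key: " + result.getKey());
        }
        manager.shutdown();
    }

    // Simple value type; the getAge() bean property backs the "age" search attribute
    public static class Person implements java.io.Serializable {
        private final String name;
        private final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        public String getName() { return name; }
        public int getAge() { return age; }
    }
}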
Monday, November 15, 2010
Direct Buffer Access Is Slow, Really?
Type: ONHEAP Took: 8978 to write and read: 10737418368
Type: DIRECT Took: 9223 to write and read: 10737418368
Type: ONHEAP Took: 8827 to write and read: 10737418368
Type: DIRECT Took: 9283 to write and read: 10737418368
Type: ONHEAP Took: 8813 to write and read: 10737418368
Type: DIRECT Took: 9604 to write and read: 10737418368
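For context, this is roughly the shape of the micro-benchmark that produces output like the above. The buffer size, the use of putLong/getLong and the loop structure are assumptions rather than the exact harness used; the point is simply to write and read about 10 GiB through a heap buffer and a direct buffer and time each pass.

import java.nio.ByteBuffer;

public class BufferBench {
    private static final long TOTAL_BYTES = 10L * 1024 * 1024 * 1024; // ~10 GiB per pass
    private static final int BUFFER_SIZE = 64 * 1024 * 1024;          // one 64 MiB buffer, reused

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            run("ONHEAP", ByteBuffer.allocate(BUFFER_SIZE));
            run("DIRECT", ByteBuffer.allocateDirect(BUFFER_SIZE));
        }
    }

    private static void run(String type, ByteBuffer buffer) {
        long start = System.currentTimeMillis();
        long bytes = 0;
        while (bytes < TOTAL_BYTES) {
            buffer.clear();
            while (buffer.remaining() >= 8) {
                buffer.putLong(bytes);   // fill the buffer with longs
            }
            buffer.flip();
            while (buffer.remaining() >= 8) {
                buffer.getLong();        // read the same data back
            }
            bytes += BUFFER_SIZE;
        }
        long took = System.currentTimeMillis() - start;
        System.out.println("Type: " + type + " Took: " + took + " to write and read: " + bytes);
    }
}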
Friday, November 5, 2010
A Couple Minutes With Ehcache Search...
Wednesday, October 6, 2010
A Couple Minutes With Ehcache BigMemory Pounder...
- Get the Ehcache with BigMemory Beta and a license key to use it.
- Get the Standalone Ehcache Pounder distribution
- Unpack the Ehcache with BigMemory distribution
- Copy the Standalone Ehcache Pounder kit into the ehcache kit and unpack it
- Copy your license file and your ehcache core jar into the pounder kit
Wednesday, September 15, 2010
A Little Bit About BigMemory for Ehcache and Terracotta ...
In talking to our users, it is clear that applications are getting more and more data hungry. According to IDC, data requirements are growing at an annual rate of 60 percent. This trend is driven further by cloud computing platforms, company consolidation and huge application platforms like Facebook. There is good news, though. Server-class machines purchased this year have a minimum of 8 Gig of RAM and likely have 32 Gig. Cisco is now selling mainstream UCS boxes with over 380 Gig of RAM (which I have tried, and it is amazing). On EC2 you can borrow 68.4 Gig machines for 2 dollars an hour (I have also tried this, and it is also pretty amazing). Memory has gotten big and extremely cheap compared to things like developer time and user satisfaction.
Unfortunately, a problem exists as well. For Java/JVM applications it is becoming an ever-increasing challenge to use all that data and memory. At the same time that the data/memory explosion is occurring, the amount of heap a Java process can effectively use has stayed largely unchanged. This is due to the ever-increasing garbage collection pauses that occur as a Java heap gets large. We see this issue at our customers, but we also see it here at Terracotta tuning our own products and the products we use, like third-party app servers, bug tracking systems, CMSs and the like. How many times have you heard "run lots of JVMs" or "don't grow the heap" from your vendors and/or devs?
So we set out to first identify the problem as it exists today, both in the wild and in-house. We then created a solution, first for us (an internal customer) and then for all of the millions of nodes of Ehcache out there (all of you).
3 Big Problems Seen by Java Applications
My Application is too slow
My application can't keep up with my users. I've got tens of gigs of data in my database, but it's overloaded and/or too slow to service my needs, either due to the complicated nature of my queries or the volume of those queries. I want my data closer to the application, so of course I start caching. Caching helps, but I want to cache more. My machine has 16 gigs of RAM, but if I grow my heap that large, I get too many Java GC pauses.
My Application's latencies aren't predictable
On average my Java application is plenty fast, but I see pauses that are unacceptable to my users. I can't meet my SLAs due to the size of my heap combined with Java GC pauses.
My software/deployment is too complicated
I've solved the Java GC problem. I run with many JVMs with heap sizes of 1-2 gigs. I partition my data and/or load balance to get the performance and availability I need, but my setup is complicated to manage because I need so many JVMs and I need to make sure the right data is in the right places. I fill up all 64 gigs of RAM on my machine, but it's too hard and fragile.
The other problem
Like many vendors, in the past we told our users to keep their heaps under 6 gig. This forced our customers either to not completely leverage the memory and/or CPU on the machines they purchased, or to stack JVMs on a machine. The former is expensive and inefficient, and the latter fragile and complex.
Here is a quick picture of what people do with their Java Applications today:
Base Case - Small heap JVM on a big machine because GC pauses are a problem
Big heap - Has long GCs that are complicated to manage
Stacked small JVM heaps - This, in combination with various sharding, load balancing and clustering techniques, is often used. It is complicated to manage, and if all the nodes GC at the same time it can lead to availability problems.
What kind of solution would help?
Here's what we believe are the requirements for a stand-alone caching solution that attacks the above problems.
- Hold a large dataset in memory without impacting GC (10s-100s of gigs) - The more data that is cached, the less you have to go to your external data source and/or disk, and the faster the app goes
- Be Fast - needs to meet the SLA
- Stay Fast - Don't fragment, don't slow down as the data is changed over time
- Concurrent - Scales with CPUs and threads. No lock contention
- Predictable - can't have pauses if I want to make my SLA
- Needs to be 100 percent Java and work on your JVM on your OS
- Restartable - A big cache like this needs to be restartable because it takes too long to build
- Should just Snap-in and work - not a lot of complexity
What have we built?
First we built a core piece of technology, BigMemory: an off-heap, direct memory buffer store with a highly optimized memory manager that meets and/or exceeds requirements 1-6 above. This piece of technology is currently being applied in two ways:
1) Terracotta Server Array - We sold it to our built-in customer, the Terracotta Server team, who can now create individual nodes of our L2 caches that hold a hundred million entries and leverage tens of gigs of memory, pause-free and with linear TPS. This leverages entire machines (even big ones) with a single JVM for higher availability, a simpler deployment model, 8x improved density and rock-steady latencies.
2) Ehcache - We've added BigMemory and a new disk store to Enterprise Ehcache to create a new tiered store, adding in requirements 7-8 from above (snap-in simplicity and restartability). The Ehcache world at large can benefit from this store just as much as the Terracotta products do.
Check out the diagram below.
Typically, using either of the BigMemory-backed products, you shrink your heap and grow your cache. By doing so, SLAs become easier to meet because GC pauses pretty much go away, and you are able to keep a huge chunk of data in memory.
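As a rough illustration of what the snap-in looks like on the Ehcache side, here is a hypothetical cache entry in the 2.3-era BigMemory style; the cache name and sizes are placeholders, and the exact attribute names should be checked against the BigMemory docs linked below:

<cache name="bigDataCache"
       maxElementsInMemory="10000"
       eternal="true"
       overflowToOffHeap="true"
       maxMemoryOffHeap="30g"/>

The JVM also needs enough direct memory to back the off-heap store, e.g. by starting it with something like -XX:MaxDirectMemorySize=32g.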
Summing up
Memory is cheap and growing. Data is important and growing just as fast. Java's GC pauses are preventing applications from keeping up with your hardware. So do what every other layer of your software and hardware stack does: cache. But in Java, the large heaps needed to hold your cache can hurt performance due to GC pauses. So use a tiered cache with BigMemory that leverages your whole machine and keeps your data as close to where it is needed as possible. That's what Terracotta is doing for its products. Do so simply, i.e. snap it into Ehcache, and have large caches without the pauses caused by GC. The result is a simpler architecture with improved performance/density and better SLAs.
Learn more at http://terracotta.org/bigmemory
Check the Ehcache BigMemory docs
Wednesday, July 28, 2010
Application Scale and Quartz "Where"
Thursday, July 22, 2010
Hiring at Terracotta - Performance/Testing Engineer and/or Lead
Tuesday, July 20, 2010
Hiring at Terracotta...
- Works hard and solves hard problems
- Strong OO/Framework design sense
- Understand the Java landscape in a deep way (e.g. Spring, J2EE, Ehcache, Quartz, REST, SOAP, NoSQL)
- Excited about performance, caching and scale-out
- Love to code and write tests
- Works well in a team and individually
- Believes that the only way to know if something works is to test it in a repeatable way
- Experience with open-source
- Live in/near San Francisco
- Tech Lead experience
- Design and build the next generation of scale-out, performance and HA software.
- Contribute to and extend Terracotta, Ehcache and/or Quartz Scheduler, some of the most popular and widely used frameworks in Java
Thursday, June 24, 2010
A Couple Minutes With Some Toolkit Samples
- Downloading and unpacking the nightly build or beta from Terracotta
- Starting the Terracotta server by calling ./bin/start-tc-server.sh
- Running 2 instances of the compiled versions of any of the above
Wednesday, June 9, 2010
A Couple Minutes With The Terracotta 3.3 Beta
- Simple Scale - Reduce the need for tuning and tweaking with an improved next gen datastore. It will allow the everyday user to achieve the kinds of scale needed for massive applications both in data size and number of nodes.
- Improved Visibility - We have added panels for Quartz and Sessions to our developer console, giving visibility into the full suite of performance and scale-out products. We have also added a more product-focused organization of information in the tool.
- Simple HA - A new panel that makes it easier to monitor interesting events that occur in a cluster. Pre-built templates for various configurations. Simplified way of migrating nodes, better defaults.
- Modularity - We have exposed some of our most powerful pieces and parts as a versioned standard API that can simply be coded against to get things like locking, queuing, maps and cluster topology. We use this API to build all four of our core products (Ehcache, Quartz, Hibernate 2nd level cache, Web Sessions).
Tuesday, May 18, 2010
Steve Jobs Stanford Commencement
Friday, May 7, 2010
A Couple Minutes With Non-Stop Ehcache
- Download Ehcache 2.1
- Download the NonStopCache 1.0
- Start the Terracotta server
- Run the program (source code below)
Regular cache. No Decorator
The size of the cache is: 0
After put the size is: 1
Here are the keys:
Key:0
Done with cache.
Sleeping, Stop your server
Disconnected NonStop with noop cache.
The size of the cache is: 0
After put the size is: 0
Here are the keys:
Done with cache.
Disconnected NonStop with local reads cache.
The size of the cache is: 1
After put the size is: 1
Here are the keys:
Key:0
Done with cache.
Disconnected NonStop with exception cache.
Exception in thread "main" net.sf.ehcache.constructs.nonstop.NonStopCacheException: getKeys timed out
at net.sf.ehcache.constructs.nonstop.behavior.ExceptionOnTimeoutBehavior.getKeys(ExceptionOnTimeoutBehavior.java:114)
at net.sf.ehcache.constructs.nonstop.behavior.ClusterOfflineBehavior.getKeys(ClusterOfflineBehavior.java:120)
at net.sf.ehcache.constructs.nonstop.NonStopCache.getKeys(NonStopCache.java:264)
at MyFirstNonStopEhcacheSample.addToCacheAndPrint(MyFirstNonStopEhcacheSample.java:45)
at MyFirstNonStopEhcacheSample.<init>(MyFirstNonStopEhcacheSample.java:40)
at MyFirstNonStopEhcacheSample.main(MyFirstNonStopEhcacheSample.java:60)
What Just Happened?
The data is first loaded into an ordinary, undecorated cache. This is performed before the server kill and proceeds without incident. The next round of operations on the cache was performed with the server down.
- These decorators are all being used on the same cache. This way you can make the behavior specific to the user of the cache. It gives tremendous flexibility.
- You'll notice this little sample flies through despite the timeout being set to 13 seconds. This is because it's in fail-fast mode. In this configurable mode, if the cache knows it can't communicate, it returns the failure case immediately. If that's not what you want, you can instead set it up to not fail fast and wait the full timeout no matter what.
- I did this work in config but the same setup can be done in code
And the Config file ehcachenonstop.xml:
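(The config itself didn't survive the copy/paste, but it looks roughly like the sketch below: a Terracotta-clustered cache with the NonStopCache decorator factory attached. The factory class and property names are taken from the NonStopCache 1.0 docs as best I recall them, the cache names are placeholders, and the noop and localReads variants just change the timeoutBehavior property.)

<ehcache name="nonstopSample">
  <terracottaConfig url="localhost:9510"/>

  <cache name="exceptionCache" maxElementsInMemory="10000" eternal="true">
    <terracotta/>
    <cacheDecoratorFactory
        class="net.sf.ehcache.constructs.nonstop.NonStopCacheDecoratorFactory"
        properties="name=nonStopExceptionCache, timeoutMillis=13000,
                    timeoutBehavior=exception, immediateTimeout=true"/>
  </cache>
</ehcache>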
Saturday, May 1, 2010
A Couple Minutes With Terracotta Toolkit Nightly
- Downloaded the nightly and unpacked
- Grabbed a quick sample app
import org.terracotta.api.ClusteringToolkit;
import org.terracotta.api.TerracottaClient;
import org.terracotta.coordination.Barrier;
public class PlayingWithExpressBarrier {
public static void main(String[] args) {
final String barrierName = args[0];
final int numberOfParties = Integer.parseInt(args[1]);
//Start the Terracotta client
ClusteringToolkit clustering = new TerracottaClient(
"localhost:9510").getToolkit();
//Get an instance of a barrier by name
Barrier barrier = clustering.getBarrier(barrierName,
numberOfParties);
try {
System.out.println("Waiting ...");
int index = barrier.await();
System.out.println("... finished " + index);
} catch (Exception e) {
e.printStackTrace();
}
}
}
- I worked in Eclipse, so at this point all I had to do was add the toolkit jar to the classpath to get it to compile
- Now kickoff the Terracotta server
- And run the sample 3 times
Thursday, April 29, 2010
Countdown To The Terracotta Toolkit Beta
- Ease of use is paramount - For both the developers that leverage the Terracotta toolkit and the people who use the stuff built using the Terracotta toolkit
- Stable API matters - We are building a compatibility kit and will maintain a strict and clear versioning scheme so that framework developers can rely on it and clearly know which versions of Terracotta work with the API version used in their application. Your users can just drop in any version of the terracotta-toolkit.jar that implements the version of the API you coded against.
- Parts is Parts - Get all the useful parts we use to build our products packaged and out for others to use.
- Scale Continuum - The parts should work both clustered and unclustered continuing our vision of a scale continuum.
Monday, April 26, 2010
Dave Klein's Scale Grails Webinar
Thursday, April 22, 2010
<terracotta clustered="true"/>
Tuesday, April 20, 2010
Ehcache 2.1 Beta - Lots of Stuff, Still Backward Compatible
- Build on our vision of an application scale continuum from one node to the cloud.
- Improve Ehcache performance both unclustered and clustered.
- Improve Ehcache's applicability both unclustered and clustered.
- The Explicit Locking Module - This module allows you to acquire and release locks manually for given keys. It required some significant rework in the unclustered stores, but it now works just as well unclustered as it does clustered, supporting fully coherent operations.
- JTA - In Ehcache 2.0 we added JTA support when clustered via Terracotta. In 2.1 we extended that functionality to unclustered caches and have begun the process of performance tuning to go along with its XA compliance.
- JTA for Hibernate Second Level Cache - We added support for using Ehcache JTA in a second level cache both clustered and unclustered.
- UnlockedReadsView - This is a subtle but important feature. For those who are using a coherent cache but have some part of an application that needs to read at high rates without impacting the rest of the cache, this view is a huge help.
- NonStopCache - Useful for guaranteeing that your cache can never stop your application. On a per cache basis an application can avoid holdups caused by problems such as a slow disk in an unclustered cache or a network outage in a clustered one.
- New Coherent methods - We've added useful methods like putIfAbsent and replace to work simply and easily with a clustered or unclustered cache in a fully coherent manner. Together with the explicit locking wrapper, much is possible (see the sketch after this list).
- We also added a bunch of tests and bug fixes to the web-cache, an extremely useful tool for making performant web applications.
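As an illustration of the new coherent methods mentioned above, a minimal sketch; the "sample" cache is created from the default cache settings just to keep the example self-contained:

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class CoherentMethodsSample {
    public static void main(String[] args) {
        // Create a cache from the default cache settings so the sample is self-contained
        CacheManager manager = CacheManager.create();
        manager.addCache("sample");
        Cache cache = manager.getCache("sample");

        // putIfAbsent: only stores the element if no mapping exists;
        // returns the element that was already there, or null if this put won
        Element previous = cache.putIfAbsent(new Element("key", "first"));
        System.out.println("previous mapping: " + previous);

        // replace: swaps in the new element only if the old one is still the current value
        boolean replaced = cache.replace(new Element("key", "first"), new Element("key", "second"));
        System.out.println("replaced: " + replaced);

        manager.shutdown();
    }
}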
Monday, April 19, 2010
Application Server Instead Of Web Server
- Less infrastructure
- Java doesn't have buffer overflows and is a bit more secure
- You can do much more interesting caching by having the web serving and app serving in the same layer