High Performance Networking in Google Chrome

High Performance Networking in Google Chrome – igvita.com

Chrome’s multi-process architecture carries important implications for how each network request is handled within the browser. Under the hood, Chrome actually supports four different execution models that determine how process allocation is performed. By default, desktop Chrome browsers use the process-per-site model, which isolates different sites from each other but groups all instances of the same site into the same process. However, to keep things simple, let’s assume one of the simplest cases: one distinct process for each open tab. From the network performance perspective, the differences here are not substantial, but the process-per-tab model is much easier to understand. The architecture dedicates one render process to each tab, which itself contains an instance of the WebKit open-source layout engine for interpreting and laying out the HTML (aka, “HTML Renderer” in the diagram), an instance of the V8 JavaScript engine, and the glue code to bridge these and a few other components. If you are curious, the Chromium wiki contains a great introduction to the plumbing.
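The difference between process-per-site and process-per-tab can be sketched in a few lines. This is an illustrative toy (not Chrome's code; the class and method names are invented here, and "site" is crudely approximated by the URL host, whereas Chrome keys on the registered domain plus scheme):

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy sketch of process-per-site allocation: tabs pointing at the same site
// share one process; a new site gets a fresh process. Process-per-tab would
// simply hand every tab its own id unconditionally.
public class ProcessAllocator {
    private final Map<String, Integer> siteToProcess = new HashMap<>();
    private int nextProcessId = 0;

    public int processForTab(String url) {
        String site = URI.create(url).getHost(); // crude "site" approximation
        return siteToProcess.computeIfAbsent(site, s -> nextProcessId++);
    }
}
```

Two tabs on the same site map to the same process id; a tab on a different site gets a new one.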

Cassandra 1.2 moves internals off-heap

Performance improvements in Cassandra 1.2 | DataStax

Disk capacities have been increasing. RAM capacities have been increasing roughly in step. But the JVM’s ability to manage a large heap has not kept pace. So as Cassandra clusters deploy more and more data per node, we’ve been moving storage engine internal structures off-heap, managing them manually in native memory instead. 1.2 moves the two biggest remaining culprits off-heap: compression metadata and per-row bloom filters. Compression metadata takes about 20GB of memory per TB of compressed data. Moving this into native memory is especially important now that compression is enabled by default. Bloom filters help Cassandra avoid scanning data files that can’t possibly include the rows being queried. They weigh in at 1-2GB per billion rows, depending on how aggressively they are tuned. Both of these use the existing sstable reference counting with minor tweaking to free native resources when the sstable they are associated with is compacted away.
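To make the off-heap idea concrete, here is a minimal sketch (not Cassandra's actual implementation) of a Bloom filter whose bit set lives in native memory via a direct ByteBuffer, so the bits never appear on the Java heap and are invisible to the garbage collector:

```java
import java.nio.ByteBuffer;

// Toy Bloom filter backed by a direct (off-heap) ByteBuffer. The k hash
// values are derived from two base hashes (the Kirsch-Mitzenmacher trick).
public class OffHeapBloomFilter {
    private final ByteBuffer bits;   // direct buffer: backed by native memory
    private final int numBits;
    private final int numHashes;

    public OffHeapBloomFilter(int numBits, int numHashes) {
        this.numBits = numBits;
        this.numHashes = numHashes;
        this.bits = ByteBuffer.allocateDirect((numBits + 7) / 8);
    }

    private int index(long key, int i) {
        long h1 = key * 0x9E3779B97F4A7C15L;
        long h2 = Long.rotateLeft(h1, 31) * 0xC2B2AE3D27D4EB4FL;
        return (int) Math.floorMod(h1 + i * h2, (long) numBits);
    }

    public void add(long key) {
        for (int i = 0; i < numHashes; i++) {
            int idx = index(key, i);
            byte b = bits.get(idx / 8);
            bits.put(idx / 8, (byte) (b | (1 << (idx % 8))));
        }
    }

    public boolean mightContain(long key) {
        for (int i = 0; i < numHashes; i++) {
            int idx = index(key, i);
            if ((bits.get(idx / 8) & (1 << (idx % 8))) == 0) return false;
        }
        return true;  // possibly a false positive, never a false negative
    }
}
```

Because the buffer is allocated with `allocateDirect`, a multi-gigabyte filter contributes almost nothing to GC pause times; the trade-off, as the excerpt notes, is that native resources must be freed explicitly (Cassandra ties this to sstable reference counting).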

Performance triage

Performance triage (David Dice’s Weblog)

Let’s say I have a running application and I want to better understand its behavior and performance. We’ll presume it’s warmed up, is under load, and is in an execution mode representative of what we think the norm would be. It should be in steady-state, if a steady-state mode even exists. On Solaris the very first thing I’ll do is take a set of “pstack” samples. Pstack briefly stops the process and walks each of the stacks, reporting symbolic information (if available) for each frame. For Java, pstack has been augmented to understand Java frames, and even report inlining. A few pstack samples can provide powerful insight into what’s actually going on inside the program. You’ll be able to see calling patterns, which threads are blocked on what system calls or synchronization constructs, memory allocation, etc. If your code is CPU-bound then you’ll get a good sense where the cycles are being spent.
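A rough in-process analogue of a single pstack sample, for Java at least, is a snapshot of every thread's stack via `Thread.getAllStackTraces()`. This sketch (not pstack itself, which works from outside the process and sees native frames too) prints each thread's name, state, and frames:

```java
import java.util.Map;

// In-process stack sample: name, state, and frames for every live thread,
// similar in spirit to one pstack sample of a JVM process.
public class StackSample {
    public static String sample() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            out.append('"').append(e.getKey().getName()).append("\" ")
               .append(e.getKey().getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(sample());
    }
}
```

Taking a handful of such samples a few seconds apart gives the same kind of poor-man's profile the excerpt describes: frames that recur across samples are where the time is going.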

Spanner: Google’s Globally-Distributed Database

Spanner is Google’s scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: non-blocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner.

JavaOne ’12

Yes, the big names, the legends, and the heroes are no more, but fret not: Oracle has managed to pull together a not-so-shabby lineup of speakers for JavaOne 2012. Yes, it’s maturing, and maybe there’s just less to talk about; Java’s a teenager now!

So if I do get to J1 this year (thank you, Oracle, for the pass!), here are the talks I’ll be attending:

CON3586 – Dealing with JVM Limitations in Apache Cassandra
Jonathan Ellis – CTO, DataStax

CON3753 – Delivering Performance and Reliability at the World’s Leading Futures Exchange
Rene Perrin – Technical Specialist Software Engineer, CME Group

CON6583 – G1 Garbage Collector Performance Tuning
Charlie Hunt – Architect, Performance Engineering, Salesforce.com
Monica Beckwith – Principal Member of Technical Staff, Oracle

CON11233 – Detecting Memory Leaks in Applications Spanning Multiple JVMs
Albert Mavashev – CTO, Nastel Technologies, Inc.

CON6465 – JVM Support for Multitenant Applications
Graeme Johnson – Cloud JVM Architect, IBM Corporation

CON6703 – ARM: Eight Billion Served—“Want That Java Superoptimized?”
Andrew Sloss – Senior Principal Engineer, ARM
Bertrand Delsart – Consulting Member of Technical Staff, Oracle

BOF6308 – Showdown at the JVM Corral
John Duimovich – Java CTO, IBM Canada Ltd.
Mikael Vidstedt – JVM Architect, Oracle

I don’t want to die in a language I can’t understand – Dick Gabriel

Dick Gabriel, a legend – “scholar, scientist, poet, performance artist, entrepreneur, musician, essayist, and yes, hacker…” speaks at Clojure/West.

Richard P. Gabriel expands upon “Mixin-based Inheritance” by G. Bracha and W. Cook, observing that software engineering precedes science and incommensurability can be used to detect paradigm shifts.