New Gecko performance tool: Backtrack

Backtrack aims to show the complete code path flow from any point back to its source, crossing asynchronous callbacks, threads, processes, network requests, timers and any kind of implementation-specific queuing, while also capturing any I/O or mutex blockade along the way.  The ‘critical flow execution path’ is put into the context of all the remaining concurrent execution flows, which makes it easy to examine how the critical flow is blocked and delayed by concurrent tasks.

The work is tracked in this bug, where you can also find patches and build instructions.  There is also an add-on that, in Backtrack-enabled builds, lets you view the actual captured data.

Click the screenshot below to view an interactive preview.  It’s a capture of loading my blog’s main page up to the first-paint notification (no e10s and no network predictor, to demonstrate the capture capabilities).

[Screenshot: backtrack-preview-1]

Backtrack combines*) Gecko Profiler and Task Tracer.

Gecko Profiler (SPS) provides instrumentation (already spread around the code base) to capture static call stacks.  I’ve enhanced the SPS instrumentation to also capture objects (i.e. the ‘this’ pointer value) and added a simple base class to easily monitor object lifetime (classes must be instrumented).

Task Tracer (TT), on the other hand, provides a generic way to track back on runnables – but not on e.g. network poll results, network requests or implementation-specific queues.  It was easy to add a hook into the TT code that connects the captured inter-object calls with information about each runnable’s dispatch source and target.

The Backtrack experimental patch:

  • Captures object lifetime: simply add ProfilerTracked<class Derived> as a base class to track the object’s lifetime and class name automatically (see the sketch after this list)
  • Annotates objects with the resource names (e.g. URI, host name) they work with at run-time
  • Connects stack and object information using the existing PROFILER_LABEL_FUNC instrumentation, recording the ‘this’ pointer value automatically; this way it collects calls between objects
  • Measures I/O and mutex wait time; an object holding a lock can easily be found
  • Sticks receipt of a particular network response exactly to its actual request transmission (here I mainly mean HTTP, but this also applies to connect() and DNS resolution)
  • Joins network polling “ins” and “outs”
  • Binds code-specific queuing and dequeuing, like our DNS resolver and HTTP request queues.  Those are not just ‘dispatch and forget’ like nsIEventTarget and nsIRunnable, but rather have priorities, complex dequeue conditions and may not end up dispatched to just a single thread.  These queues are very important from the resource scheduling point of view.
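
To make the object tracking more concrete, here is a hypothetical C++ sketch of what an instrumented class could look like in a Backtrack-enabled build; the class name is made up and the exact PROFILER_LABEL_FUNC arguments are only an assumption, so take it as an illustration of the idea rather than code from the actual patch:

// Hypothetical example class, only meaningful with the experimental patch applied.
// Deriving from ProfilerTracked<Derived> records the object's lifetime and class
// name automatically.
class MyTrackedLoader : public ProfilerTracked<MyTrackedLoader>
{
public:
  void Start()
  {
    // Existing Gecko Profiler label; with the Backtrack patch it also records
    // the |this| pointer value, which is how calls between tracked objects
    // get connected.
    PROFILER_LABEL_FUNC(js::ProfileEntry::Category::NETWORK);

    // ... the object's actual work would go here ...
  }
};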

Followups:

  • IPC support, i.e. crossing processes as well
  • Let the analysis also mark anything ‘related’ to reaching a selected path end (e.g. my favorite first-paint time and all the CSS loads involved)
  • Probably persist the captured raw logs and allow the analysis to be done offline

Disadvantages: just one – significant memory consumption.

*) The implementation is so far not deeply bound to the SPS and TT memory data structures.  I do the capture on my own – actually a third data collection, alongside SPS and TT.  I’m still proving the concept this way, but if it’s found useful and bearable to land in this form as a temporary way of collecting the data, we can optimize and clean up as follow-up work.

Partial solar eclipse 2015

It’s all rather improvised – the programmable remote shutter release failed, and I was simply too lazy to shoot at exact intervals :) The alignment isn’t perfectly accurate either, but I like it this way anyway.

Tripod, Canon EOS 60D, Canon EF 200/2.8 L II, ND8 + Baader Astrosolar.  Each frame of the animation is roughly 6 – 10 RAW images @ ISO 100, 1/125 s, f/4, no flatfield.  Registax 6.

[Animation: partial-solar-eclipse-2015-animation]

Firefox HTTP cache v1 API disabled

Recently we landed the new HTTP cache for Firefox (“cache2”) on mozilla-central.  It has been in nightly builds for a while now and seems very likely to stick on the tree and ship in Firefox 32.

Given the positive data we have so far, we’re taking another step today toward making the new cache official: we have disabled the old APIs for accessing the HTTP cache, so addons will now need to use the cache2 APIs.  One important benefit is that the cache2 APIs are more efficient and never block on the main thread.  The other benefit is that the old cache APIs were no longer pointing at the actual data anyway (it’s all in cache2 now) :)

This means that the following interfaces are now no longer supported:

  •   nsICache
  •   nsICacheService
  •   nsICacheSession
  •   nsICacheEntryDescriptor
  •   nsICacheListener
  •   nsICacheVisitor

(Note: for now nsICacheService can still be obtained: however, calling any of its methods will throw NS_ERROR_NOT_IMPLEMENTED.)
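
For illustration, this is roughly what existing addon code will now run into (a minimal sketch, assuming a chrome-privileged scope):

var oldCacheService = Components.classes["@mozilla.org/network/cache-service;1"]
                      .getService(Components.interfaces.nsICacheService);

try {
  // Any method call on the old service now fails.
  oldCacheService.createSession(
    "HTTP",
    Components.interfaces.nsICache.STORE_ANYWHERE,
    Components.interfaces.nsICache.STREAM_BASED
  );
} catch (e) {
  if (e.result == Components.results.NS_ERROR_NOT_IMPLEMENTED) {
    // Time to migrate to the cache2 APIs listed below.
  }
}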

Access to previously stored cache sessions is no longer possible, and the update also causes a one-time deletion of old cache data from users’ disks.

Going forward addons must instead use the cache2 equivalents:

  •   nsICacheStorageService
  •   nsICacheStorage
  •   nsICacheEntry
  •   nsICacheStorageVisitor
  •   nsICacheEntryDoomCallback
  •   nsICacheEntryOpenCallback

Below are some examples of how to migrate code from the old to the new cache API.  See the new HTTP cache v2 documentation for more details.

The new cache2 implementation gets rid of some of the terrible features of the old cache (frequent total data loss, main-thread jank during I/O) and significantly improves page load performance.  We apologize for the developer inconvenience of needing to upgrade to a new API, but we hope the performance benefits outweigh it in the long run.

Example of the cache v1 code (now obsolete) for opening a cache entry:

var cacheService = Components.classes["@mozilla.org/network/cache-service;1"]
                   .getService(Components.interfaces.nsICacheService);

var session = cacheService.createSession(
  "HTTP",
  Components.interfaces.nsICache.STORE_ANYWHERE,
  Components.interfaces.nsICache.STREAM_BASED
);

session.asyncOpenCacheEntry(
  "http://foo.com/bar.html",
  Components.interfaces.nsICache.ACCESS_READ_WRITE,
  {
    onCacheEntryAvailable: function (entry, access, status) {
      // And here is the cache v1 entry
    }
  }
);

Example of the cache v2 code doing the same thing:

let {LoadContextInfo} = Components.utils.import(
  "resource://gre/modules/LoadContextInfo.jsm", {}
);
let {PrivateBrowsingUtils} = Components.utils.import(
  "resource://gre/modules/PrivateBrowsingUtils.jsm", {}
);

var cacheService = Components.classes["@mozilla.org/netwerk/cache-storage-service;1"]
  .getService(Components.interfaces.nsICacheStorageService);

var storage = cacheService.diskCacheStorage(
  // Note: make sure |window| is the window you want
  LoadContextInfo.fromLoadContext(
    PrivateBrowsingUtils.privacyContextFromWindow(window, false), false),
  false // second argument: whether to look up an app cache first
);

storage.asyncOpenURI(
  makeURI("http://foo.com/bar.html"), // an nsIURI, e.g. from Services.io.newURI(...)
  "",
  Components.interfaces.nsICacheStorage.OPEN_NORMALLY,
  {
    onCacheEntryCheck: function (entry, appcache) {
      return Components.interfaces.nsICacheEntryOpenCallback.ENTRY_WANTED;
    },
    onCacheEntryAvailable: function (entry, isnew, appcache, status) {
      // And here is the cache v2 entry
    }
  }
);

There are a lot of similarities: instead of a cache session we now have a cache storage with a similar meaning – it represents a distinct space within the whole cache store – it’s just less generic than it was before, so that it cannot be misused now.  There is also one new mandatory argument when getting a storage: an nsILoadContextInfo object that distinguishes whether the cache entry belongs to a Private Browsing context, to an anonymous load, or has an app ID.
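
For example, here is a minimal sketch of picking distinct storage spaces with different nsILoadContextInfo values (assuming the default, anonymous and private helpers exported by LoadContextInfo.jsm):

let {LoadContextInfo} = Components.utils.import(
  "resource://gre/modules/LoadContextInfo.jsm", {}
);

var cacheService = Components.classes["@mozilla.org/netwerk/cache-storage-service;1"]
  .getService(Components.interfaces.nsICacheStorageService);

// Storage for ordinary loads (not private, not anonymous, no app ID).
var defaultStorage = cacheService.diskCacheStorage(LoadContextInfo.default, false);

// A separate space used for anonymous loads.
var anonymousStorage = cacheService.diskCacheStorage(LoadContextInfo.anonymous, false);

// And another separate space seen by Private Browsing contexts.
var privateStorage = cacheService.diskCacheStorage(LoadContextInfo.private, false);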

(Credits to Jason Duell for help with this blog post)