Illusion of atomic reference counting

22nd July, 2016

Most people believe that having an atomic reference counter makes it safe to use RefPtr on multiple threads without any further synchronization.  The opposite may be true, though!

Imagine some simple code using our commonly used helper classes: RefPtr<> and an object Type with a ThreadSafeAutoRefCnt reference counter and the standard AddRef and Release implementations.
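For illustration, a minimal sketch of such a setup (assuming the usual NS_INLINE_DECL_THREADSAFE_REFCOUNTING convenience macro, which is backed by ThreadSafeAutoRefCnt; the member name is made up for the example):

class Type
{
public:
  NS_INLINE_DECL_THREADSAFE_REFCOUNTING(Type)

  // ... the object's actual functionality ...

private:
  ~Type() {} // destructor stays private, the object dies via Release()
};

RefPtr<Type> mMember; // shared between threads with no extra locking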

Sounds safe, but there is a glitch most people may not realize.  Consider an example where one piece of code does this, with no additional locks involved:

RefPtr<Type> local = mMember; // mMember is RefPtr<Type>, holding an object

And another piece of code then, presumably on a different thread, does this:

mMember = new Type(); // mMember's value is rewritten with a new object

Usually, people believe this is perfectly safe.  But it’s far from it.

Just break this down into the actual atomic operations and put the two threads side by side:

Thread 1                            Thread 2

local.value = mMember.value;        .
/* context switch */                .
.                                   Type* temporary = new Type();
.                                   temporary->AddRef();
.                                   Type* old = mMember.value;
.                                   mMember.value = temporary;
.                                   old->Release();
.                                   /* context switch */
local.value->AddRef();              .
The same applies to clearing a member (or a global, while we are at it) while some other thread may try to grab a reference to it:

RefPtr<Type> service = sService;
if (!service) {
  return; // service being null is our 'after shutdown' flag
}

And another thread does this, usually during shutdown:

sService = nullptr; // while sService was holding an object

And here is what actually happens:

Thread 1                            Thread 2

local.value = sService.value;       .
/* context switch */                .
.                                   Type* old = sService.value;
.                                   sService.value = nullptr;
.                                   old->Release();
.                                   /* context switch */
local.value->AddRef();              .

And where is the problem?  Clearly, if the Release() call on the second thread is the last one on the object, the AddRef() on the first thread will do its job on a dying or already dead object.  The only correct way is to have both the in and out assignments protected by a mutex, or to ensure that nobody can be trying to grab a reference from a globally accessed RefPtr at the moment it is finally released or just re-assigned.  The latter may not always be easy or even possible.
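For completeness, a minimal sketch of the mutex-protected variant (the class and member names are made up for the example; the lock is a plain mozilla::Mutex from mozilla/Mutex.h used with MutexAutoLock):

class Owner
{
public:
  Owner() : mLock("Owner::mLock") {}

  // Reader side: take the strong reference while holding the lock, so the
  // AddRef() cannot race with the final Release() on another thread.
  already_AddRefed<Type> GetMember()
  {
    mozilla::MutexAutoLock lock(mLock);
    RefPtr<Type> local = mMember;
    return local.forget();
  }

  // Writer side: swap the member under the same lock; the old object is
  // released only after the lock has been dropped, so a potentially
  // expensive destructor never runs with mLock held.
  void SetMember(Type* aNew)
  {
    RefPtr<Type> old;
    {
      mozilla::MutexAutoLock lock(mLock);
      old = mMember.forget();
      mMember = aNew; // AddRefs the new object
    }
  } // 'old' goes out of scope here, its Release() happens outside the lock

private:
  mozilla::Mutex mLock;
  RefPtr<Type> mMember; // only ever touched with mLock held
};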

Anyway, if somebody has a suggestion for how to solve this universally without using an additional lock, I would be really interested!

Backtrack meets Gecko Profiler

6th June, 2016

Backtrack is about to become a new performance tool, focused on revealing and solving scheduling and delay problems.  Those are big performance offenders, very hard to track down, and hidden from conventional profilers.

To find out how long it takes, and everything that has to happen, to reach a certain point – an objective – just add a simple instrumentation marker.  When hit during a run, it is added to a list you can then pick from and start tracing back to its origin.  Backtrack follows the selected objective back to the originating user input event that started the whole processing chain.

The walk-back crosses runnables and their wait time in thread event queues, but also network requests and responses, any code-specific queues such as DOM mutations, scheduled reflows or background JS parsing 1), monitor and condvar notifications, mutex acquisitions 2), and disk I/O operations.

Visually, the result is a single timeline – we can call it a critical path – revealing wait, network and CPU times as distinct intervals involved in reaching solely the selected objective.  Spotting dispatch wait delays in particular is then very easy.  What is most important and new is that Backtrack tells you which other operations or events block the critical path (make it wait) and where they were scheduled from.  And more importantly, it recognizes which of them are (or are not) related to reaching the selected objective.  Those not related are then clear candidates for rescheduling.

To distinguish related from unrelated operations, Backtrack captures all sub-tasks that are involved in reaching the selected objective.  A good example is the page’s first-paint time – actually the unsuppression of painting.  First paint is blocked by the loading of more than one resource: the HTML and the CSS and JS referenced from the head.  These loads and their processing – the sub-tasks – happen in parallel, and only the completion of all of them unsuppresses painting (put in a very simplified way, of course).  Each such sub-task’s completion is marked with added instrumentation.  That creates a list of sub-objectives that are then added to the whole picture.

Screenshot of how Backtrack is integrated into the Gecko Profiler Cleopatra web UI

Future improvements:

  • Backtrack could be used in our performance automation.  Besides calculating the time between an objective and its input source event, it can also break that down into CPU time vs. dispatch delays vs. network response time.  It could also be able to filter out code paths clean of any outer jitter.
  • Indeed, networking has a strong influence on load times.  Adding a more detailed breakdown and analysis of how well we schedule and allocate network resources is one of the next steps.
  • Adding PCAPs, or even letting Backtrack capture network activity Wireshark-style directly from inside Firefox and join it with the Gecko Profiler UI, might help too.

Backtrack is a work in active progress and is not yet available to users of the Gecko Profiler.  There are patches for Gecko, but also for the Cleopatra UI and the Gecko Profiler Add-on.  The UI changes, where the analysis also happens, are mostly prototype-like and need a cleanup.  There are also problems with larger memory consumption and a bigger chance of hitting OOMs when processing data that contains Backtrack-captured markers.


1) code-specific queues need to be manually instrumented
2) with the ability to follow to the thread that was holding the mutex for the time you were waiting to acquire it

 

Filter errors out of a Mozilla build on the command line

7th April, 2016

Mozilla build errors filtered once again under the build log

I wrote a small filter script that lists all build errors once again at the bottom of the whole build log, so that you don’t have to look for them like a needle in a haystack.  It’s something I wanted mach to do natively, but I was always told something like “filter it yourself”.  So here it is 🙂

  • Download this small script *)
  • Copy it to your source directory or somewhere your $PATH points to
  • run mach as: ./mach build | err

When there are errors during the build, they will be listed under the build log and conveniently highlighted.

*) It’s tuned for and mainly targets MinGW, but might well work on Linux/OS X too.
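For the curious, the idea behind the filter is trivial.  Here is a rough, hypothetical C++ sketch of it (the real err script may be implemented quite differently, and the plain "error" substring match is only a naive placeholder for its actual filtering):

#include <iostream>
#include <string>
#include <vector>

int main()
{
  std::vector<std::string> errors;
  std::string line;
  while (std::getline(std::cin, line)) {
    std::cout << line << '\n';                      // pass the log through unchanged
    if (line.find("error") != std::string::npos) {  // remember suspected error lines
      errors.push_back(line);
    }
  }
  std::cout << "\n----- errors repeated -----\n";
  for (const std::string& e : errors) {
    std::cout << e << '\n';                         // repeat them at the bottom
  }
  return 0;
}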

INACCESSIBLE_BOOT_DEVICE on Windows 10 boot after update of the Intel Rapid Storage Technology driver

16th September, 2015

Got a BSOD after you’ve installed the latest version of Intel RST, 14.6.0.1029, from Intel’s download center?  During boot, staring at the Windows logo and the spinning wheel, after a minute or two getting just an INACCESSIBLE_BOOT_DEVICE error and a “we must reboot” message?  Restarts, nothing helps?  Yeah, I’ve been there.

  • Update: the problem persists with Intel RST version 14.8.0.1042, despite claims in the release notes that it had been fixed.  It’s even worse: the system drive could not be found from within the recovery command prompt.  This time I had to create a system restore USB drive on a different Win 10 machine (w/o iRST).  Renaming the file didn’t work.  Fortunately, I had created a restore point before running SetupRST.exe.  The restore worked well.
  • Another interesting thing is that the driver I currently use is the faulty 14.6.0.1029 version (verified by inspecting the version of the .sys file on disk directly).  This would suggest that the driver itself actually works, but that the installation process screws something else up.  Since my system now works well and is vital for me, I’m not going to investigate this further.

Two and a half hours of googling, reverting registry settings and looping in the auto recovery mode, until I found a simple solution myself for reverting the Intel RST driver update:

Note: this can only be applied when you have updated the driver from a previously working version, since it relies on the previous driver file still being stored on your disk.

  • Check that your BIOS and RAID settings are as expected, since I once encountered an iRST update that screwed that up – it actually turned off RAID!
  • During boot hold F8
  • Choose Troubleshoot
  • Choose Advanced options
  • Choose Command Prompt, a command prompt window, as you know it, should open
  • My system drive was mounted as E:; if yours is mounted elsewhere, replace E: in the commands below with that letter
  • At the prompt type:
    • cd /d E:\Windows\system32\drivers
    • ren iaStorA.sys iaStorA.sys-bad-version
    • cd ..
    • dir /s iaStorA.sys
  • That will list something like this:
     Volume in drive C has no label.
    Volume Serial Number is XXXX-XXXX
    
     Directory of E:\Windows\System32\drivers
    
    07/29/2015  19:44         1,462,720 iaStorA.sys
                   1 File(s)      1,462,720 bytes
    
     Directory of E:\Windows\System32\DriverStore\FileRepository\iastorac.inf_amd64_26544f4e51074f52
    
    05/28/2014  10:10           672,104 iaStorA.sys
                   1 File(s)        672,104 bytes
    
     Directory of E:\Windows\System32\DriverStore\FileRepository\iastorac.inf_amd64_61378e65f4f142a0
    
    07/29/2015  19:44         1,462,720 iaStorA.sys
                   1 File(s)      1,462,720 bytes
    
         Total Files Listed:
                   3 File(s)      2,806,928 bytes
     
     
    
  • For me the previous working driver file is apparently at DriverStore\FileRepository\iastorac.inf_amd64_26544f4e51074f52; yours may be elsewhere, so update the source directory in the copy command below accordingly
  • Continue typing the following commands; you should still be in the E:\Windows\System32 directory:
    • copy DriverStore\FileRepository\iastorac.inf_amd64_26544f4e51074f52\iaStorA.sys drivers\iaStorA.sys
  • You should see a “1 file(s) copied” message
    • exit
  • Now normally reboot
  • Good luck!

After that, the Intel RST service still starts up, shows the status etc., the disks are fine, and everything seems to work, even after another (deliberate) reboot.  The only weird thing was that System Restore for the C: drive was turned off.  I’m not sure whether it was caused by the RST update, by the boot problems, by some of my other manual changes (not listed here), or something else.  Fortunately, re-enabling it works well, so it’s not an issue.

It’s also a good idea to roll back the driver from the Device Manager (Storage Controllers/Intel(R) *** SATA RAID Controller/Properties/Driver/Roll Back Driver) to put the registry records back into the correct state.

Conclusion: Don’t update the Intel Rapid Storage Technology driver!

String parsing made simple with mozilla::Tokenizer

28th July, 2015

 


I can see FindChar, Substring, ToInteger and even atoi, strchr, strstr and sscanf craziness all over the Mozilla code base.  There are, though, much better and, more importantly, safer ways to parse even very simple input.

I wrote a parser class with an API derived from lexical analyzers that makes parsing simple input very easy.  Just include mozilla/Tokenizer.h and use the mozilla::Tokenizer class.  It implements a subset of a lexical analyzer’s features and also nicely hides the input buffer’s boundary checks from the consumer.

To describe the principle briefly: Tokenizer recognizes tokens like whole words, integers, white spaces and special characters.  The consumer never works directly with the string or its characters, only with the pre-parsed parts (identified tokens) returned by this class.

 

There are two main methods of Tokenizer:

  • bool Next(Token& result);

If there is anything to read from the input at the current internal read position, including the EOF, it returns true and result is filled with the token type and an appropriate value, easily accessible via a simple variant-like API.  The internal read cursor is shifted to the start of the next token in the input before this method returns.

  • bool Check(const Token& tokenToTest);

If the token at the current internal read position is equal (by type and value) to what has been passed in the tokenToTest argument, true is returned and the internal read cursor is shifted to the next token.  Otherwise (the token is different than expected), false is returned and the read cursor is left unaffected.

A few usage examples:

Construction

  #include "mozilla/Tokenizer.h"

  mozilla::Tokenizer p(NS_LITERAL_CSTRING("Sample string 2015."));

Reading a single token, examining it

  mozilla::Tokenizer::Token t;
  bool read = p.Next(t);
  // read == true, we have read something and t has been filled
  // Following our example string...
  if (t.Type() == mozilla::Tokenizer::TOKEN_WORD) {
    t.AsString(); // returns "Sample"
  }

Checking on a token value and automatically skipping on a positive test

  if (!p.CheckWhite()) {
    throw "I expect a space here!";
  }

  read = p.Next(t);
  // read == true
  t.Type() == mozilla::Tokenizer::TOKEN_WORD;
  t.AsString() == "string";

  if (!p.CheckWhite()) {
    throw "A white space is expected here!";
  }

Reading numbers

  read = p.Next(t);
  // read == true
  t.Type() == mozilla::Tokenizer::TOKEN_INTEGER;
  t.AsInteger() == 2015;

Reaching the end of the input

  read = p.Next(t);
  // read == true
  t.Type() == mozilla::Tokenizer::TOKEN_CHAR;
  t.AsChar() == '.';

  read = p.Next(t);
  // read == true
  t.Type() == mozilla::Tokenizer::TOKEN_EOF;

  read = p.Next(t);
  // read == false, we are behind the EOF
  // t is here undefined!

More features

To learn about the more advanced features of the Tokenizer – there are not that many, don’t be scared 😉 – look at the well documented Tokenizer.h file under xpcom/ds.

As a teaser you can go through this more advanced example or check out the gtest for Tokenizer:

#include "mozilla/Tokenizer.h"

using namespace mozilla;

{
  // A simple list of key:value pairs delimited by commas
  nsCString input("message:parse me,result:100");

  // Initialize the parser with an input string
  Tokenizer p(input);
  // A helper var keeping type and value of the token just read
  Tokenizer::Token t;

  // Loop over all tokens in the input
  while (p.Next(t)) {
    if (t.Type() == Tokenizer::TOKEN_WORD) {
      // A 'key' name found
      if (!p.CheckChar(':')) {
        // Must be followed by a colon
        return; // unexpected character
      }

      // Note that here the input read position is just after the colon
      // Now switch by the key string
      if (t.AsString() == "message") {
        // Start grabbing the value
        p.Record();
        // Loop until EOF or comma
        while (p.Next(t) && !t.Equals(Tokenizer::Token::Char(',')))
          ;
        // Claim the result
        nsAutoCString value;
        p.Claim(value);
        MOZ_ASSERT(value == "parse me");

        // We must revert the comma so that the code below recognizes the flow correctly
        p.Rollback();
      } else if (t.AsString() == "result") {
        if (!p.Next(t) || t.Type() != Tokenizer::TOKEN_INTEGER) {
          return; // expected a value and that value must be a number
        }

        // Get the value, here you know it's a valid number
        uint32_t number = t.AsInteger();
        MOZ_ASSERT(number == 100);
      } else {
        // Here t.AsString() is any key but 'message' or 'result', ready to be handled
      }

      // On comma we loop again
      if (p.CheckChar(',')) {
        // Note that now the read position is after the comma
        continue;
      }
      // No comma?  Then only EOF is allowed
      if (p.CheckEOF()) {
        // Cleanly parsed the string
        break;
      }
    }

    return; // The input is not properly formatted
  }
}

 

The Tokenizer currently works only with ASCII input, but it can easily be enhanced to also support UTF-8/16 encodings or even specific code pages if needed.

TCPSocket.js/TCPServerSocket.js IPC mess captured

15th July, 2015

TCPSocket, implemented in JavaScript, is a DOM technology providing web pages direct access to TCP network sockets.  We support both outgoing connections and listening for incoming connections via a server socket.

I’m constantly asked to review changes to this code in /dom/network, where TCPSocket et al. reside.  And I’m always lost in the mess of all the classes and objects involved in the IPC bridging.

As a side result of the last review request, I’ve dived into the jungle and created a “UML” flow chart that helps me understand – and not forget again – the complicated IPC flow in both TCPSocket and TCPServerSocket.

Here they are.

TCPSocket IPC
TCPServerSocket IPC

Not perfect – I’m no UML expert – but I think one can understand them and get a picture that may be helpful 😉

Unseen astronomical phenomenon discovered

21st June, 2015

A new bright nebula found very close to a water surface.

Nebula or a wooden stick?

Who recognizes what this actually is? 😉

New Gecko performance tool: Backtrack

9th June, 2015

Backtrack aims to show the complete code path flow from any point back to its source, crossing asynchronous callbacks, threads, processes, network requests, timers and any kind of implementation-specific queuing, plus capturing any I/O or mutex blockade.  The ‘critical flow execution path’ is put into the context of all the remaining concurrent execution flows.  It’s then easy to examine how the critical flow is blocked and delayed by concurrent tasks.

The work is tracked in this bug, where you can also find patches and build instructions.  There is also an add-on that, in Backtrack-enabled builds, allows you to view the actual captured data.

Click the screenshot below to view an interactive preview.  It’s a capture of the load of my blog’s main page up to the first-paint notification (no e10s and no network predictor, to demonstrate the capture capabilities).

backtrack-preview-1

Backtrack combines*) Gecko Profiler and Task Tracer.

Gecko Profiler (SPS) provides instrumentation (already spread around the code base) to capture static call stacks.  I’ve enhanced the SPS instrumentation to also capture objects (i.e. the ‘this’ pointer value) and added a simple base class to easily monitor object lifetime (classes must be instrumented).

Task Tracer (TT), on the other hand, provides a generic way to track back through runnables – but not through e.g. network poll results, network requests or implementation-specific queues.  It was easy to add a hook into the TT code that connects the captured inter-object calls with information about a runnable’s dispatch source and target.

The Backtrack experimental patch:

  • Captures object lifetimes (simply add ProfilerTracked<class Derived> as a base class to track the object lifetime and class name automatically; see the sketch after this list)
  • Annotates objects with resource names (e.g URI, host name) they work with at run-time
  • Connects stack and object information using the existing PROFILER_LABEL_FUNC instrumentation, recording the this pointer value automatically; this way it collects calls between objects
  • Measures I/O and mutex wait time; an object holding a lock can easily be found
  • Ties the receipt of a particular network response exactly to its actual request transmission (here I mainly mean HTTP, but it also applies to connect() and DNS resolution)
  • Joins network polling “ins” and “outs”
  • Binds code-specific queuing and dequeuing, like our DNS resolver and HTTP request queues.  Those are not just ‘dispatch and forget’ like nsIEventTarget and nsIRunnable; they rather have priorities, complex dequeue conditions, and may not end up dispatched to just a single thread.  These queues are very important from the resource scheduling point of view.
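A minimal sketch of the opt-in mentioned in the first bullet above (ProfilerTracked comes from the experimental Backtrack patch; the class name here is made up for the example):

// Deriving from ProfilerTracked is all that is needed: the base class
// records the object's construction, destruction and class name, and the
// existing PROFILER_LABEL_FUNC instrumentation inside the methods then
// ties captured call stacks to this particular instance via 'this'.
class ExampleLoader : public ProfilerTracked<ExampleLoader>
{
  // ... the class' normal members and methods ...
};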

Followups:

  • IPC support, i.e. also crossing processes
  • Let the analysis also mark anything ‘related’ to achieving the selected path’s end (e.g. my favorite first-paint time and all CSS loads involved)
  • Probably persist the captured raw logs and allow the analysis to be done offline

Disadvantages: just one – significant memory consumption.

*) The implementation is so far not deeply bound to the SPS and TT memory data structures.  I do the capture on my own – actually a third data collection, alongside SPS and TT.  I’m still proving the concept this way, but if it’s found useful and bearable to land in this form as a temporary way of collecting the data, we can optimize and clean up as follow-up work.

Partial solar eclipse 2015

20th March, 2015

It’s quite rough – the programmable remote shutter release failed, and I was simply too lazy to shoot at precise intervals 🙂  The alignment isn’t entirely accurate, but I like it this way anyway.

Tripod, Canon EOS 60D, Canon EF 200/2.8 L II, ND8 + Baader Astrosolar.  Each frame of the animation is roughly 6–10 RAW images @ ISO 100, 1/125 s, f/4, no flatfield.  Registax 6.

partial-solar-eclipse-2015-animation

Just a photo…

23rd January, 2015

Misty and cloudy field
