Building mozilla code directly from Visual Studio IDE

29th November, 2013

Yes, it’s possible!  With a single key press you can build and have a nice list of errors in the Error List window, clickable to get to the bad source code location easily.  It was a fight, but here it is.  Tested with Visual Studio Express 2013 for Windows Desktop, but I believe this all can be adapted to any version of the IDE.

• Create a shell script; you will have to use it every time to start Visual Studio from mozilla-build’s bash prompt:

export MOZ__INCLUDE=$INCLUDE
export MOZ__LIB=$LIB
export MOZ__LIBPATH=$LIBPATH
export MOZ__PATH=$PATH
export MOZ__VSINSTALLDIR=$VSINSTALLDIR
# This is for a standard installation of Visual Studio 2013 Desktop; alter the paths to your desired/installed IDE version
cd "/c/Program Files (x86)/Microsoft Visual Studio 12.0/Common7/IDE/"
./WDExpress.exe &

• Create a solution ‘mozilla-central’ located at the parent directory where your mozilla-central repository clone resides. Say you have a structure like C:\Mozilla\mozilla-central, which is the root source folder where you find .hg, configure.in and all the modules’ sub-dirs. Then C:\Mozilla\ is the parent directory.
• In that solution, create a Makefile project ‘mozilla-central’, again located at the parent directory. It will, a bit unexpectedly, be created where you probably want it – in C:\Mozilla\mozilla-central.
• Let the Build Command Line for this project be (use the multi-line editor to copy & paste: the combo-like arrow on the right, then the <Edit…> command):

call "$(MOZ__VSINSTALLDIR)\VC\bin\vcvars32.bat"
set INCLUDE=$(MOZ__INCLUDE)
set LIB=$(MOZ__LIB)
set LIBPATH=$(MOZ__LIBPATH)
set PATH=$(MOZ__PATH)
set MOZCONFIG=c:\optional\path\to\your\custom\mozconfig
cd $(SolutionDir)
python mach --log-no-times build binaries

Now when you make a modification to a C/C++ file, just build the ‘mozilla-central’ project to run the great build binaries mach feature and quickly build the changes right from the IDE.  Compilation and link errors as well as warnings will be nicely caught in the Error List.

BE AWARE: There is one problem – when there is a typo/mistake in an exported header file, it is opened as a new file in the IDE from the _obj/dist/include location.  When you miss that and modify that file, it will be overwritten on the next build!  (I’ll ask Bas Schouten, as Chris Pearce suggests, if there is some solution.)

With these scripts you can use the Visual Studio 2013 IDE but build with any other version of VC++ of your choice.  It’s independent; just run the start-up script from a different VS configuration mozilla-build prompt.

I personally also create projects for modules (like /netwerk, /docshell, /dom) I often use.  Just create a Makefile project located at the source root directory with the name of the module directory.  The project file will then be located in the module – I know, not really what one would expect.  Switch Solution Explorer for that project to show all files, include them all in the project, and you are done.

A few other tweaks:

• Assuming you properly use an object dir, change the Output Directory and Intermediate Directory to point e.g. to $(SolutionDir)\<your obj dir>\$(Configuration)\.  The logging and other junk won’t then be created in your source repository.
• Add:

^.*\.vcproj.*
^.*\.vcxproj.*
.sln$
.suo$
.ncb$
.sdf$
.opensdf$

to your custom hg ignore file to prevent the Visual Studio project and solution files from interfering with Mercurial.  The same is suggested for git, if you prefer it.

Note: you cannot use this for a clobbered build because of an undisclosed Python Windows-specific bug.  See here why.  Do clobbered builds from a console, or you may experiment with clobber + configure from a console and then build from the IDE.

QueryPerformanceCounter calibration with GetTickCount

14th November, 2013

In one of my older posts I describe how the Mozilla Platform decides whether this high precision timer function is behaving properly or not.  That algorithm is now obsolete, and we have a better one.

The current logic, which seems to have proven stable, uses a faults-per-tolerance-interval algorithm, introduced in bug 836869 – Make QueryPerformanceCounter bad leap detection heuristic smarter.  I decided on this kind of evaluation since the only really critical use of the hi-res timer is for animations and video rendering, where large leaps in time may cause missing frames or jitter during playback.  Faults per interval is a good reflection of the stability we want to ensure in reality.  QueryPerformanceCounter is not always perfectly precise when calibrated against GetTickCount, but that does not always need to be considered faulty behavior of the QueryPerformanceCounter result.

The improved algorithm

There is no need for a calibration thread, any calibration code, or any global skew monitoring.  Everything is self-contained.

As the first measure, we consider QueryPerformanceCounter stable when the TSC is stable, meaning it runs at a constant rate during all ACPI power saving states [see the HasStableTSC function].

When TSC is not stable or its status is unknown, we must use the controlling mechanism.

Definable properties

• the number of failures we are willing to tolerate during an interval, set at 4
• the fault-free interval, we use 5 seconds
• a threshold considered a large enough skew to indicate a failure, currently 50 ms

Fault-counter logic outline

• keep an absolute time checkpoint, that shifts to the future with every failure by one fault-free interval duration, base it on GetTickCount
• each call to Now() produces a timestamp that records values of both QueryPerformanceCounter (QPC) and GetTickCount (GTC)
• when two timestamps (T1 and T2) are subtracted to get a duration, the following math happens:
• deltaQPC = T1.QPC – T2.QPC
• deltaGTC = T1.GTC – T2.GTC
• diff = deltaQPC – deltaGTC
• if diff < 4 * 15.6 ms: return deltaQPC ; this cuts off what GetTickCount’s low resolution unfortunately cannot cover
• overflow = diff – 4 * 15.6 ms
• if overflow < 50 ms (the failure threshold): return deltaQPC
• from now on, the result of the subtraction is deltaGTC only
• fault counting part:
• if deltaGTC > 2000 ms: return ; we don’t count failures when timestamps are more than 2 seconds apart *)
• failure-count = max( checkpoint – now, 0 ) / fault-free interval
• if failure-count > failure tolerance count: disable usage of QueryPerformanceCounter
• otherwise: checkpoint = now + (failure-count + 1) * fault-free interval
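The outline above can be sketched compactly in C++.  All constants and names here are mine and purely illustrative; the real logic lives in TimeStamp_windows.cpp (which, e.g., works with the absolute skew):

```cpp
#include <algorithm>

// A simplified, hypothetical sketch of the fault-per-interval heuristic.
const double kGTCTickMs          = 15.6;    // GetTickCount resolution
const double kFailureThresholdMs = 50.0;    // skew considered a failure
const double kFaultFreeMs        = 5000.0;  // the fault-free interval
const int    kFailureTolerance   = 4;       // failures tolerated per interval
const double kSuspendCutoffMs    = 2000.0;  // don't count failures across long gaps

struct Timestamp { double qpcMs; double gtcMs; };

struct QPCMonitor {
  double checkpointMs = 0.0;  // absolute GTC-based checkpoint, shifts on failures
  bool   qpcUsable    = true;

  // Subtract two timestamps: prefer the QPC delta when it agrees with the
  // GTC delta, otherwise fall back to GTC and count a failure.
  double Subtract(const Timestamp& t1, const Timestamp& t2, double nowGtcMs) {
    double deltaQPC = t1.qpcMs - t2.qpcMs;
    double deltaGTC = t1.gtcMs - t2.gtcMs;
    double diff = deltaQPC - deltaGTC;
    if (diff < 0) diff = -diff;           // the real code checks absolute skew

    if (diff < 4 * kGTCTickMs)            // skew GTC's resolution cannot cover
      return deltaQPC;
    double overflow = diff - 4 * kGTCTickMs;
    if (overflow < kFailureThresholdMs)   // small skew is tolerated
      return deltaQPC;

    // From now on the result is deltaGTC only; count the failure unless the
    // timestamps are more than 2 seconds apart (e.g. around a suspend).
    if (deltaGTC <= kSuspendCutoffMs) {
      int failures = static_cast<int>(
          std::max(checkpointMs - nowGtcMs, 0.0) / kFaultFreeMs);
      if (failures > kFailureTolerance)
        qpcUsable = false;                // too many leaps: stop trusting QPC
      else
        checkpointMs = nowGtcMs + (failures + 1) * kFaultFreeMs;
    }
    return deltaGTC;
  }
};
```

When qpcUsable flips to false, this corresponds to falling back to GetTickCount-only timestamps.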

You can check the code by looking at TimeStamp_windows.cpp directly.

I’m personally quite happy with this algorithm.  So far, there have been no issues with redraw after wake-up, even on exotic or older configurations.  Video plays smoothly, while we keep hi-res timing for telemetry and logging where possible.

*) The reason is to omit unexpected QueryPerformanceCounter leaps from failure counting when a machine is suspended, even for a short period of time.

Mozilla Firefox new HTTP cache is live!

23rd September, 2013

The new Firefox HTTP cache back-end – which keeps the cache content after a crash or a kill and doesn’t cause any UI hangs – has landed!

It’s currently disabled by default, but you can test it by installing Firefox Nightly and enabling it.  This applies to Firefox Mobile builds as well.  There is a preference that enables or disables the new cache; find it in about:config.  You can switch it on and off any time you want, even during active browsing; there is no need to restart the browser for the changes to take effect:

browser.cache.use_new_backend

• 0 – disable, use the old crappy cache (files are stored under the Cache directory in your profile) – now the default
• 1 – enable, use the brand new HTTP cache (files are stored under the cache2 directory in your profile)

Other new preferences that control the cache behavior:

browser.cache.memory_limit

• the maximum number of kB kept in RAM to preserve the most used content in memory, so page loads speed up
• on desktop this is now set to 50 MB (i.e. 51,200 kB)
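For reference, the equivalent lines in a user.js file would look like this (you can equally flip these in about:config; the 51200 value mirrors the 50 MB desktop default mentioned above):

```
user_pref("browser.cache.use_new_backend", 1);   // 1 = new cache2 back-end
user_pref("browser.cache.memory_limit", 51200);  // in kB, i.e. 50 MB
```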

There are still open bugs to fix before we can turn this fully on.  The most significant is that we don’t automatically delete cache files when browser.cache.disk.capacity is exceeded, so your disk can get flooded by heavy browsing.  But you can still delete the cache manually using Clear Recent History.

Enabling the new HTTP cache by default is planned for Q4/2013.  For Firefox Mobile it may even be sooner, since we are using Android’s context cache directory, which is automatically deleted when the storage runs out of space.  Hence, we don’t need to monitor the cache capacity ourselves on mobile.

Please report any bug you find during tests under Core :: Networking: Cache.

Appcache prompt removed from Firefox

19th August, 2013

The bothersome prompt shown when a web app uses the offline application cache (a.k.a. appcache) has been removed from Firefox!

Beginning with Firefox 26, there will no longer be this prompt for users to accept.  Firefox will cache the content automatically, as if the user had clicked the Allow button.

This actually applies to every software based on Gecko, like Firefox Mobile or Firefox OS. Tracked in bug 892488.

Application cache, not really a favorite feature, is not that widely used on today’s web, and one of the reasons has been this prompt.  It may be a little late in the game, but it has finally happened.  I’m curious what the feedback from web developers is going to be.

New Firefox HTTP cache backend – story continues

16th August, 2013

In my previous post I was writing about the new cache backend and some of the very first testing.

Now we’ve stepped further, and there are significant improvements.  I was also able to test with more varied hardware this time.

The most significant difference is a single I/O thread with relatively simple event prioritization.  Opening and reading urgent (render-blocking) files is done first, opening and reading lower-priority files after that, and writing is performed as the last operation.  This greatly improves performance when loading from a non-warmed cache, and also first-paint time in many scenarios.
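The scheduling policy can be sketched with a simple priority queue.  This is an illustration only, not the actual cache code; every name below is made up:

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Single-I/O-thread event prioritization: urgent reads are served first,
// ordinary reads next, and writes last.
enum class IoPriority { OpenReadUrgent = 0, OpenRead = 1, Write = 2 };

struct IoEvent {
  IoPriority priority;
  uint64_t   seq;                // submission order within one priority class
  std::function<void()> run;
};

struct IoEventCompare {
  bool operator()(const IoEvent& a, const IoEvent& b) const {
    if (a.priority != b.priority)
      return a.priority > b.priority;  // lower enum value is served first
    return a.seq > b.seq;              // FIFO within the same priority
  }
};

using IoQueue =
    std::priority_queue<IoEvent, std::vector<IoEvent>, IoEventCompare>;
```

The I/O thread would pop and run events from such a queue in a loop; a write submitted before an urgent read still runs after it.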

The numbers are much more precise than in the first post.  My measuring is more systematic and careful by now.  Also, I’ve merged gum with the latest mozilla-central code a few times, and there are surely some improvements from that too.

Here are the results; I’m using a 50 MB limit for keeping cached content in RAM.

[ complete page load time / first paint time ]

Old iMac with mechanical HDD
Backend First visit Warm go to 1) Cold go to 2) Reload
mozilla-central 7.6s / 1.1s 560ms / 570ms 1.8s / 1.7s 5.9s / 900ms
new back-end 7.6s / 1.1s 530ms / 540ms 2.1s / 1.9s** 6s / 720ms

Old Linux box with mechanical 'green' HDD
Backend First visit Warm go to 1) Cold go to 2) Reload
mozilla-central 7.3s / 1.2s 1.4s / 1.4s 2.4s / 2.4s 5.1s / 1.2s
new back-end 7.3s / 1.2s (or** 9+s / 3.5s) 1.35s / 1.35s 2.3s / 2.1s 4.8s / 1.2s

Fast Windows 7 box with SSD
Backend First visit Warm go to 1) Cold go to 2) Reload
mozilla-central 6.7s / 600ms 235ms / 240ms 530ms / 530ms 4.7s / 540ms
new back-end 6.7s / 600ms 195ms / 200ms 620ms / 620ms*** 4.7s / 540ms

Fast Windows 7 box and a slow microSD
Backend First visit Warm go to 1) Cold go to 2) Reload
mozilla-central 13.5s / 6s 600ms / 600ms 1s / 1s 7.3s / 1.2s
new back-end 7.3s / 780ms (or** 13.7s / 1.1s) 195ms / 200ms 1.6s or 3.2s* / 460ms*** 4.8s / 530ms

To sum up – the most significant changes appear when using really slow media.  First-paint times greatly improve, not to mention the 10000% better UI responsiveness!  Still, there is space left for more optimizations.  We know what to do:

• deliver data in larger chunks; now we fetch only in 16 kB blocks, hence larger files (e.g. images) load very slowly
• think about interaction with upper levels by means of some kind of intelligent flood control

1) Open a new tab and navigate to a page when the cache is already pre-warmed, i.e. data are already fully in RAM.

2) Open a new tab and navigate to a page right after the Firefox start.

* I was testing with my blog home page.  There are a few large images, ~750 kB and ~600 kB.  Delivering data to upper-level consumers only in 16 kB chunks causes this suffering.

** This is an interesting regression.  Sometimes with the new backend we delay first paint and the overall load time.  It seems the cache engine is ‘too good’ and opens the floodgate too much, overwhelming the main thread event queue.  Needs more investigation.

*** Here it’s a combination of flooding the main thread with image loads, slow image data loading itself, and the fact that in this case we first paint only after all resources on the page have loaded – that needs to change.  It’s also supported by the fact that cold load first paint time is significantly faster on microSD than on SSD.  The slow card apparently simulates flood control here for us.

New Firefox HTTP cache backend, first impressions

9th July, 2013

After some two months of coding, Michal Novotný and I are close to having a first stable-enough “private testing” build with the new and simplified HTTP cache back-end.

The two main goals we’ve met are:

• Be resilient to crashes and process kills
• Get rid of any UI hangs or freezes (a.k.a. janks)

We’ve abandoned the current disk format and use a separate file for each URL, however small it is.  Each file uses self-check hashes to verify it is correct, so no fsyncs are needed.  Everything is asynchronous or fully buffered.  There is a single background thread to do all I/O, like opening, reading and writing.  On Android we are writing to the context cache directory; this way the cached data are actually treated as such.
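The self-check idea can be sketched like this; the hash function (FNV-1a here) and all names are stand-ins of mine, not the actual cache code.  The point is that a torn write after a crash is detected on read, instead of being prevented up front with fsync:

```cpp
#include <cstdint>
#include <vector>

// Stand-in chunk hash; the real cache uses its own hashing.
uint32_t ChunkHash(const std::vector<uint8_t>& data) {
  uint32_t h = 2166136261u;  // FNV-1a offset basis
  for (uint8_t b : data) { h ^= b; h *= 16777619u; }
  return h;
}

// Every chunk carries its own checksum on disk.
struct Chunk { std::vector<uint8_t> data; uint32_t hash; };

Chunk WriteChunk(std::vector<uint8_t> data) {  // store data together with hash
  uint32_t h = ChunkHash(data);
  return Chunk{std::move(data), h};
}

bool VerifyChunk(const Chunk& c) {             // detect corruption on read
  return ChunkHash(c.data) == c.hash;
}
```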

I’ve performed some first tests using http://janbambas.cz/ as a test page.  As I write this post, there are some 460 images on it.  Testing was done on a relatively fast machine, but what is important is to differentiate on storage speed.  I had two extremes available: an SSD and an old, slow-like-hell microSD card in a USB reader.

Testing with a microSD card:

mozilla-central 16s 7s
new back-end 12s 4.5s

mozilla-central 7s 700ms
new back-end 5.5s 500ms

Type URL and go, cached and warmed
mozilla-central 900ms 900ms
new back-end 400ms 400ms

Type URL and go, cached but not warmed
mozilla-central 5s 4.5s
new back-end ~28s 5-28s

*) Here I’m getting unstable results.  I’ve been doing more testing with more concurrent open and read threads.  It seems there is not much effect, and the jitter in the time measurements is just noise.

I will report more on concurrent thread I/O in a different post later, since I find it quite an interesting space to explore.

Clearly, the cold “type and go” test case shows that blockfiles are beating us here.  But the big difference is that the UI is completely jank-free with the new back-end!

Testing on an SSD disk:

The results are not that different between the current and the new back-end; there is only a small regression in the warmed and cold “go to” test cases:

Type URL and go, cached and warmed
mozilla-central 220ms 230ms
new back-end 310ms 320ms
Type URL and go, cached but not warmed
mozilla-central 600ms 600ms
new back-end 1100ms 1100ms

Having multiple threads seems not to have any effect, as far as the precision of my measurements goes.

At this moment I am not sure what causes the regression in both “go to” cases on an SSD, but I believe it’s just a question of some simple optimizations, like delivering more than just 4096 bytes per thread loop as we do now, or the fact that we don’t cache redirects – a known bug right now.

Still here and want to test it yourself?  Test builds can be downloaded from the ‘gum’ project tree.  Disclaimer: the code is very, very experimental at this stage, so use at your own risk!

C/2011 L4 (PanSTARRS)

22nd May, 2013

C/2011 L4 PanSTARRS near the star Errai (γ Cep) in the constellation Cepheus.

16 May 2013, 1:48 – 2:48 CEST
South of Prague
Canon 60d + CLS CCD Clip
Canon 200mm/f2.8 L II USM
HEQ5, without guiding
8x5min 800 ISO @ f3.5
Flat 8x, Bias 5x (unfortunately shot afterwards at home – I forgot my flat field…)

Clouds came in later, so the photo is a bit blurry.

DeepSkyStacker 3.3.2, stacked on the comet according to this guide.  Even so, the star trails are not visible.  On a closer look, only faint ghosts are noticeable.  Final touches in Photoshop (32-bit).

Fix: Lenovo ThinkPad fully recharges battery after restart despite charge thresholds

8th May, 2013

This is a ‘how to’ for battery threshold settings on Lenovo laptops running Windows 8 when your battery recharges fully after a system restart.

I’ve recently updated my ancient Lenovo laptop to Windows 8.  To prolong the lifespan of my battery, I had to set up the charge thresholds again – don’t start charging before the battery falls below 5% and stop at 100% – using PowerManager 6.36 with the 1.66.0.22 PowerManager driver.

However, after a system restart the battery started to charge fully every time – a way to significantly shorten its lifespan.

To fix it, I did some searching.  There are forum posts on how to set charge thresholds manually; however, that didn’t work for me.  Changing the ChargeStartControl, ChargeStartPercentage, ChargeStopControl and ChargeStopPercentage registry keys under HKEY_LOCAL_MACHINE\SOFTWARE\Lenovo\PWRMGRV\Data didn’t help.

The correct registry settings are located elsewhere:

HKEY_CURRENT_USER\Software\Lenovo\PWRMGRV\Data\<eleven alphanums code>\

The “code” seems to be random and looks e.g. like 1983HD83HM7.  This is just an example; it will certainly be different on your machine!

You may need to manually create any missing DWORD values; in my case those are (in hex):
ChargeStartControl: 0x00000001
ChargeStartPercentage: 0x00000004
ChargeStopControl: 0x00000001
ChargeStopPercentage: 0x00000064
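If you prefer not to click through regedit, the same values can be imported from a .reg file.  This is a sketch using the example code from above; replace it with the code found on your own machine:

```
Windows Registry Editor Version 5.00

; Replace 1983HD83HM7 with the eleven-character code found under
; HKEY_CURRENT_USER\Software\Lenovo\PWRMGRV\Data on your machine.
[HKEY_CURRENT_USER\Software\Lenovo\PWRMGRV\Data\1983HD83HM7]
"ChargeStartControl"=dword:00000001
"ChargeStartPercentage"=dword:00000004
"ChargeStopControl"=dword:00000001
"ChargeStopPercentage"=dword:00000064
```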

After you do the changes, restart your machine.

These settings then produce the following status that persists after system restart:

Million Marihuana March 2013

8th May, 2013

And another year down the dr..n

Last year’s Million Marihuana March 2012 wasn’t much different, except maybe for more and better vegan and vegetarian food.

Firefox detailed event tracer – about:timeline

19th April, 2013

I always wanted to see all the little things that happen when loading a web page in Firefox.  As a Gecko developer I know well how it works, but actually seeing the interaction and chaining is not at all obvious, unless you study crazy NSPR logs.  Hence, I started developing a tool, called the event tracer, to get a timeline showing the hidden guts.

An example screenshot tells the story best – a trial run of a www.mozilla.org load:

Planning improvements

At this time the work is not complete.  There is a lot more to do to make this an actually useful development tool, and my next steps are to hunt those requirements down.

I am using about:timeline to verify patches I review do what they intend to do.  It can also help find hidden bugs.

However, using about:timeline for discovery of performance-suboptimal code paths turned out not to be that simple.  Events are just spread all around, and connections between them are not easily, if at all, discoverable.

Hence, this needs more thinking and work.

My first thought for improvement is to focus more on “the resource” and the single object dealing with it.  It might be better to show all the events happening on, e.g., a single instance of an http channel or http transaction than to tortuously hunt them down somewhere in the graph.  There is a simple way to highlight and filter, but that is not enough for an analytical view.

Then, I’m missing a general way to easily recognize how things are chained together.  So, I’d like to link events that spawn one another (like an http channel creating an http transaction, then a connection etc.) and present the timeline more like a Gantt chart, plus show a critical path or flow for any selected pass through.

From inspecting the timeline it should be visible where bottlenecks and long wait times worth fixing are.  At this time I don’t have a complete clear plan on this, though.

Still here?  Cool.  If you have any thoughts or ideas on how to use the data we collect and visualize as a valuable source for performance optimization surgery, please feel free to share them here.  For more on how the timeline data are produced, check below.

How it works

This event track in the image is produced with special code instrumentation.  To get, for instance, the “net::http::transaction” traces, the following three places in the code have been instrumented:

1. WAIT – record time when an http transaction is scheduled:

nsresult
nsHttpTransaction::Init(uint32_t caps,
                        nsHttpConnectionInfo *cinfo,
                        nsIInputStream *requestBody,
                        nsIEventTarget *target,
                        nsIInterfaceRequestor *callbacks,
                        nsITransportEventSink *eventsink,
                        nsIAsyncInputStream **responseBody)
{
    MOZ_EVENT_TRACER_COMPOUND_NAME(static_cast<nsAHttpTransaction*>(this),
                                   ...);  // name arguments elided in this excerpt

    MOZ_EVENT_TRACER_WAIT(static_cast<nsAHttpTransaction*>(this),
                          "net::http::transaction");

2. EXEC – record the time when the transaction first comes into action – the time it gets a connection assigned and starts its communication with the server:

void
nsHttpTransaction::SetConnection(nsAHttpConnection *conn)
{
    NS_IF_RELEASE(mConnection);

    if (conn) {
        MOZ_EVENT_TRACER_EXEC(static_cast<nsAHttpTransaction*>(this),
                              "net::http::transaction");
    }

3. DONE – record the time when the transaction has finished its job by completing the response fetch:

nsHttpTransaction::Close(nsresult reason)
{
    LOG(("nsHttpTransaction::Close [this=%x reason=%x]\n", this, reason));

    ...

    MOZ_EVENT_TRACER_DONE(static_cast<nsAHttpTransaction*>(this),
                          "net::http::transaction");
}

The thread timeline where an event is finally displayed is the thread the EXEC code has been called on.

The exact definition of the WAIT and EXEC phases is up to the developer.  For me, the WAIT phase is any time an operation is significantly blocked before it can be carried out; it’s the time with the main performance effect that we may be particularly interested in shortening.  A few examples:

• time spent in a thread’s event queue – duration from the dispatch to the run
• time spent waiting for an asynchronous callback such as reading from disk or network
• time waiting for necessary resources, such as an established TCP connection before an object can proceed with its job
• time spent waiting to acquire a lock or a monitor

How to bind a URL or any identifying info to an event

The following instrumentation is used – it is the same nsHttpTransaction::Init shown above, this time with the focus on the MOZ_EVENT_TRACER_COMPOUND_NAME call:

nsresult
nsHttpTransaction::Init(uint32_t caps,
                        nsHttpConnectionInfo *cinfo,
                        nsIInputStream *requestBody,
                        nsIEventTarget *target,
                        nsIInterfaceRequestor *callbacks,
                        nsITransportEventSink *eventsink,
                        nsIAsyncInputStream **responseBody)
{
    MOZ_EVENT_TRACER_COMPOUND_NAME(static_cast<nsAHttpTransaction*>(this),
                                   ...);  // host + path arguments elided in this excerpt

    MOZ_EVENT_TRACER_WAIT(static_cast<nsAHttpTransaction*>(this),
                          "net::http::transaction");

Here the http transaction event is assigned the host + path of the resource it loads.

The object’s this pointer, which needs to be properly cast by the developer, is what sticks it all together.  This is the main difference from how usual profiling tools work.  The event timeline provides a view of event chaining across thread and method boundaries, not just a pure stack trace.
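The idea can be sketched as follows; every name here is illustrative and made up, not the real tracer API – the object’s pointer is the key that chains the WAIT/EXEC/DONE records together, even across threads:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum class Phase { Wait, Exec, Done };

struct Record {
  Phase       phase;
  uint64_t    timeMs;
  std::string threadName;  // thread the phase was recorded on
};

class Tracer {
  std::map<const void*, std::vector<Record>> mTracks;  // keyed by object identity
public:
  void Mark(const void* obj, Phase p, uint64_t timeMs, std::string thread) {
    mTracks[obj].push_back({p, timeMs, std::move(thread)});
  }

  // Wait duration = time between the WAIT and EXEC records of one object,
  // even if they were recorded on different threads.
  uint64_t WaitDurationMs(const void* obj) const {
    uint64_t wait = 0, exec = 0;
    for (const Record& r : mTracks.at(obj)) {
      if (r.phase == Phase::Wait) wait = r.timeMs;
      if (r.phase == Phase::Exec) exec = r.timeMs;
    }
    return exec - wait;
  }
};
```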

View details of a tracked event

Each event track is bound with e.g. a URL it deals with, where applicable.  You can inspect the URL (the associated resource) and more details when an event is clicked on:

Wait between tracks the time the event spent waiting, i.e. between the time the event was scheduled and the time its execution started, both relative to when time tracking was turned on.  The number in parentheses is simply the wait phase duration.

Execution tracks the time spent executing, i.e. when the intended job itself started and when it actually finished.  The parenthesized number is how long the job execution took.

Posted from is the name of the thread the event has been scheduled from.

The Filter button is used to quickly filter out this particular event plus its sister events.  How it works is described below.

The Zero time button is used to shift the “time 0.000” of the timeline to the start time of the inspected event, so that you can inspect the recorded timing of other events relative to this particular one.

The mxr link opens a search for the event name in the code.  This way you can quickly inspect how this event’s timing is actually instrumented and collected right in the code.

Filtering timeline events

You can filter events using two filtering functions:

• By the type of an event (e.g. “net::http::transaction”, “docshell::pageload” etc.)
• By name of a resource an event has been associated with (e.g. “www.mozilla.org”, “/favicon.ico” etc…)

Filter by type – the following event types are implemented (instrumented) so far.  You get this check-box list, after the tracer has run, when you click filter events in the top bar:

Each event has a namespace, e.g. for “net::http::transaction” the namespaces are “net::” and “net::http::”.  You can turn on or off the whole set of events in a namespace easily.  Currently there are only “net::” and “docshell::” top level namespaces worth mentioning.

Filtering by resource, i.e. usually by the URL a particular event or set of events has been bound to, is also possible when you click filter resource:

You can enter the filter string manually or use the provided autocomplete.  The by-the-resource filtering works as follows:

1. we inspect whether the string set as the filter is a substring of the event resource string
2. we inspect whether the event resource string is a substring of the string set as the filter

If one of these conditions passes, we display the event, otherwise, we hide it.
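In code, the bidirectional match described above boils down to something like this (the function name is made up; the real check lives in the about:timeline frontend):

```cpp
#include <string>

bool ResourceMatchesFilter(const std::string& resource,
                           const std::string& filter) {
  // 1. the filter string is a substring of the event resource string, or
  // 2. the event resource string is a substring of the filter string
  return resource.find(filter) != std::string::npos ||
         filter.find(resource) != std::string::npos;
}
```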

This way when you enter “http://www.mozilla.org/favicon.ico” as a filter, you see the http channel and transaction for this resource load, as well as any DNS and TCP connection setups related to “www.mozilla.org”.

To collect a trace:

• Create an optimized build of Firefox with the --enable-visual-event-tracer configure option
• Run Firefox
• Press the orange [ Start ] button; you get a message that the trace is running
• Proceed with your intended actions, e.g. a page load, and let it finish
• Wait a little to get your events timeline details

So, here it is.  It’s a work in progress, though.  I’ll keep posting updates.