Poupou's Corner of the Web

Looking for perfect security? Try a wireless brick.
Otherwise you may find some unperfect stuff here...

Weblog

Reminder: You Rule Day!

Tomorrow, Saturday April 26th, is the first (and best yet ;-) Gendarme Rule Day.

Gendarme hackers will be online, in #gendarme on GIMPnet, for a great day of hacking with you to create new rules. Bring your ideas, drinks, food and join us for the ride.

Here's Nestor's original announcement, with more details about the event.


4/25/2008 09:20:21

Oops, I did it again

Google Summer of Code 2008 has taken another step forward and published the list of projects they will be funding this summer. I'm glad that Nestor will be back with us to work (well he never left but I'm still glad) on Gendarme again this year. Speaking of Nestor, he's the great mind behind the Gendarme Rule Day idea.

Reminder: this is next Saturday (April 26th). Don't miss it!

No one applied for the optimization projects but, like I said, there are a lot of things to do and, best of all, results are quickly visible. In fact I had already spotted a nice candidate inside Cecil last time, Mono.Cecil.Cil.OpCode, but decided at the time to wait for a candidate (or a few). GSoC proposals ended and I decided to give it a go myself (already two weeks ago). Again I created a baseline (using the same configuration as before). Total memory required: 796289 KB.

Then I applied the same trick that S.R.E.OpCode (System.Reflection.Emit.OpCode, inside corlib) uses: keep all enums as bytes and use the properties to cast each byte back into an enum at runtime (i.e. no impact on the API). The structure went from its original 36 bytes, which is way too large for a structure (which is also why I, well Gendarme, spotted it), to 8 bytes. Total memory went down to 731768 KB (44% less memory on Mono.Cecil.Cil.Instruction allocations, 8% less in total).
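A minimal sketch of the byte-backed enum trick described above. The enums and struct names here are hypothetical stand-ins, not the real Cecil or S.R.E. types, and the field set is much smaller than the real OpCode. The point is only that the public properties keep their enum types while the storage shrinks.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical enums (assumption); each one defaults to a 4-byte int.
public enum FlowKind { Next, Branch, Return }
public enum CodeKind { Primitive, Macro, Prefix }
public enum OperandKind { None, Int32, Method }

// Naive layout: every enum field is stored at its full width.
public struct WideOpCode {
	public FlowKind Flow;
	public CodeKind Kind;
	public OperandKind Operand;
}

// Compact layout: each enum is stored as a byte and cast back in its property,
// so callers still see the same enum-typed API.
public struct CompactOpCode {
	readonly byte flow;
	readonly byte kind;
	readonly byte operand;

	public CompactOpCode (FlowKind flow, CodeKind kind, OperandKind operand)
	{
		this.flow = (byte) flow;
		this.kind = (byte) kind;
		this.operand = (byte) operand;
	}

	public FlowKind Flow { get { return (FlowKind) flow; } }
	public CodeKind Kind { get { return (CodeKind) kind; } }
	public OperandKind Operand { get { return (OperandKind) operand; } }
}

public static class OpCodeSizes {
	// Marshal.SizeOf reports the marshaled size, used here as an approximation
	// of the in-memory footprint of these simple sequential structs.
	public static int Wide { get { return Marshal.SizeOf (typeof (WideOpCode)); } }
	public static int Compact { get { return Marshal.SizeOf (typeof (CompactOpCode)); } }
}
```

Comparing OpCodeSizes.Wide with OpCodeSizes.Compact shows the saving for this toy layout. The trick only works while every enum value fits in a byte.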

I also played with a smaller version: 2 bytes for the OpCode and everything else inside tables. I did not expect miracles here since such a small amount is too small to be allocated nicely (e.g. alignment). Total memory went down to 722565 KB (50% less memory on Mono.Cecil.Cil.Instruction allocations, 9% less in total). However it was twice as slow (table lookups against the casts of the previous version), so this was not the version selected to go into SVN.
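For comparison, a rough sketch of the table-based variant. The names are hypothetical and the table only covers two one-byte opcodes (ret is 0x2A and br is 0x38 in IL); it is meant to show where the extra lookup comes from, not how the experiment was actually coded.

```csharp
using System;

// Hypothetical enum (assumption), kept tiny for the sketch.
public enum FlowSketch : byte { Next, Branch, Return }

public struct TinyOpCode {
	// One table shared by all instances, indexed by the second opcode byte.
	// Simplification: two-byte (0xFE-prefixed) opcodes would need their own table.
	static readonly FlowSketch[] flowTable = BuildFlowTable ();

	// The only per-instance data: the two opcode bytes.
	readonly byte op1;
	readonly byte op2;

	public TinyOpCode (byte op1, byte op2)
	{
		this.op1 = op1;
		this.op2 = op2;
	}

	public int Value { get { return (op1 << 8) | op2; } }

	// Every property access is now an array lookup instead of a cast.
	public FlowSketch Flow { get { return flowTable [op2]; } }

	static FlowSketch[] BuildFlowTable ()
	{
		FlowSketch[] table = new FlowSketch [256]; // entries default to Next
		table [0x2A] = FlowSketch.Return; // ret
		table [0x38] = FlowSketch.Branch; // br
		return table;
	}
}
```

The struct itself carries almost nothing, but each property read now goes through an indexed load with a bounds check, which is the kind of cost the cast-based version avoids.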

When you keep looking at logs, even old ones, you eventually see new things in them. I knew most memory was inside Hashtable instances and had tried to minimize and reuse them, but I forgot, well didn't see, something: the biggest ones are temporary and hold the offset and Instruction for the IL of each method, and the second biggest are elsewhere but identical. The debugging symbol assemblies, Mono.Cecil.[Mdb|Pdb].dll, were building the same temporary hashtables to help place their values.

Simply sharing the same hashtable between them brings total memory down from (new baseline) 722970 KB to 647636 KB, a reduction of a bit over 10%.
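The sharing idea can be sketched roughly like this. All type and method names are hypothetical, and a generic Dictionary stands in for the Hashtable mentioned above; the real Cecil readers are organized differently.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a decoded IL instruction (assumption).
public sealed class InstructionSketch {
	public readonly int Offset;
	public readonly string Name;

	public InstructionSketch (int offset, string name)
	{
		Offset = offset;
		Name = name;
	}
}

public static class BodyReaderSketch {
	// Builds the offset -> instruction map once, while the method body is decoded.
	public static Dictionary<int, InstructionSketch> Read (IList<InstructionSketch> decoded)
	{
		Dictionary<int, InstructionSketch> byOffset =
			new Dictionary<int, InstructionSketch> (decoded.Count);
		foreach (InstructionSketch ins in decoded)
			byOffset [ins.Offset] = ins;
		return byOffset;
	}
}

public static class SymbolReaderSketch {
	// Receives the map built by the body reader instead of allocating
	// a second, identical temporary table for the same method.
	public static List<InstructionSketch> Resolve (
		IDictionary<int, InstructionSketch> byOffset,
		IEnumerable<int> sequencePointOffsets)
	{
		List<InstructionSketch> result = new List<InstructionSketch> ();
		foreach (int offset in sequencePointOffsets) {
			InstructionSketch ins;
			if (byOffset.TryGetValue (offset, out ins))
				result.Add (ins);
		}
		return result;
	}
}
```

The saving comes purely from not building the second table; lookups behave the same either way.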

It actually took a bit more time (but not much and with fewer changes) than my previous attempt but I still got (more than) another 15% decrease in Cecil memory usage (at least if you're using the symbol helpers). Anyway enough Cecil hacking for now, next weekend is on Gendarme (Rule Day) and it's time I get my good friend JB hacking (more) on it ;-)


4/21/2008 20:53:26

A bit of trivia

Trivia:
How many assemblies, let's say from Mono's 2.0 profile, are using the C# lock statement (on anything)?

Before answering, let's talk about what else is going on with Gendarme.

OK, back to the question: why is this important?

Simply because Gendarme has a rule, DoubleCheckLockingRule (actually one of the oldest rules, contributed by Aaron Tomb back in 2005), that checks them. Such a rule needs to scan the IL of every method and this takes time. The newer API allows rules to short-circuit the analysis of every method (or type ...) if we can determine (or plant logic), at initialization time, that it is not worth it.

Trivia Answer:
Out of the 72 (2.0) assemblies I have, 42 (58%) don't refer to System.Threading.Monitor, which is used to implement the C# lock statement. This number is not quite right since mscorlib.dll does not refer to, but defines, the type. So 31 (2.0) assemblies (43%) actually use lock.

Practically, this answer can be translated into C# like this:

    public override void Initialize (IRunner runner)
    {
        base.Initialize (runner);

        // is this module using Monitor.Enter ? (lock in c#)
        // if not then this rule does not need to be executed for the module
        // note: mscorlib.dll is an exception since it defines, not refer, System.Threading.Monitor
        Runner.AnalyzeModule += delegate (object o, RunnerEventArgs e) {
            Active = (e.CurrentAssembly.Name.Name == Constants.Corlib) ||
                e.CurrentModule.TypeReferences.ContainsType ("System.Threading.Monitor");
        };
    }

Similar events exist for assemblies, types and methods. They can even be chained when necessary.
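To see why this saves time, here is a self-contained toy version of the pattern. MiniRunner, ModuleInfo and LockRuleSketch are hypothetical and are not the Gendarme API; the only assumption is that the runner raises the module event first and skips rules that switched themselves off.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a module: a name plus the type names it references.
public sealed class ModuleInfo {
	public readonly string Name;
	public readonly HashSet<string> TypeReferences;

	public ModuleInfo (string name, params string[] typeReferences)
	{
		Name = name;
		TypeReferences = new HashSet<string> (typeReferences);
	}
}

public sealed class LockRuleSketch {
	public bool Active = true;
	public int ModulesChecked;

	public void Initialize (MiniRunner runner)
	{
		// Decide per module whether the expensive scan is worth running.
		runner.AnalyzeModule += delegate (ModuleInfo module) {
			Active = module.Name == "mscorlib" ||
				module.TypeReferences.Contains ("System.Threading.Monitor");
		};
	}

	// Placeholder for the costly part (scanning the IL of every method).
	public void CheckModule (ModuleInfo module)
	{
		ModulesChecked++;
	}
}

// Hypothetical runner: raises the event, then runs the rule only if it is still active.
public sealed class MiniRunner {
	public event Action<ModuleInfo> AnalyzeModule;

	public int Run (LockRuleSketch rule, IEnumerable<ModuleInfo> modules)
	{
		int scanned = 0;
		foreach (ModuleInfo module in modules) {
			if (AnalyzeModule != null)
				AnalyzeModule (module);
			if (rule.Active) {
				rule.CheckModule (module);
				scanned++;
			}
		}
		return scanned;
	}
}
```

Run returns how many modules actually reached CheckModule; every module that never references Monitor costs one set lookup instead of a full scan.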

How much time do we gain? The original rule executed on all 72 (2.0) assemblies took 12.225839 seconds. Adding this method reduces the execution time to 9.122709 seconds (25% less time). That's a lot considering that all 72 assemblies are still loaded by the runner (through Cecil).

The original reason I looked back at this rule was that it called OpCode.Equals(object), which required casting and boxing, while the == operator was much better suited (a little fact found by Gendarme itself). Also, since this is an old rule, some of the stuff, like the use of non-generic collections, was easy to update. Other things, like avoiding memory allocations as much as possible, were also possible.
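The difference between the two comparisons can be illustrated with a small hypothetical struct (OpSketch is not the Cecil OpCode, just a one-byte stand-in with the same two ways to compare):

```csharp
using System;

// Hypothetical value type (assumption), reduced to a single byte.
public struct OpSketch {
	readonly byte value;

	public OpSketch (byte value)
	{
		this.value = value;
	}

	// The argument is typed as object, so passing a struct boxes it (a heap
	// allocation), and the body then has to type-check and unbox it again.
	public override bool Equals (object obj)
	{
		return obj is OpSketch && ((OpSketch) obj).value == value;
	}

	public override int GetHashCode ()
	{
		return value;
	}

	// Both operands stay plain values: no boxing, no cast.
	public static bool operator == (OpSketch a, OpSketch b)
	{
		return a.value == b.value;
	}

	public static bool operator != (OpSketch a, OpSketch b)
	{
		return a.value != b.value;
	}
}
```

With only Equals(object) available, a call such as a.Equals (b) boxes b on every comparison, which adds up in a loop over every instruction of every method; a == b gives the same answer without the allocation.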

The refresh of the rule brings the execution time down to 8.455765 seconds (30% less time than the original, but a very small gain, around 7%, wrt the previous optimization). In fact, removing the Initialize from this version gives 11.34947 seconds (again 7%), less than one second under the original time.

Using Initialize is much easier, safer (unlikely to break anything) and gives much better results than refactoring the rules' source code to be more optimal. Other rules need to be reviewed to see if the same hack can be applied to them. It's low-hanging fruit for anyone who wants to optimize Gendarme.


4/7/2008 20:58:54

The views expressed on this website/weblog are mine alone and do not necessarily reflect the views of my employer.