How to switch on register optimization for JIT-compiler by default?
John Demigor - 11 Oct 2004 15:12 GMT
Just installed .NET Framework 1.1 and VS.NET 2003. Wrote a simple program:

    [DllImport("kernel32.dll")]
    public static extern int GetTickCount();

    int a = 0;

    private void someMethod() { a = a + a; }

    private void button1_Click(object sender, System.EventArgs e)
    {
        int a = GetTickCount();
        for (int x = 0; x < 1000000000; x++)
            someMethod();
        a = GetTickCount() - a;
        Text = a.ToString();
    }

I have an AMD Athlon XP 2600+ (1.9GHz). This program, compiled in Release mode, runs for 19 seconds. The same code runs for 3 seconds in Delphi 5 and 2 seconds in Java 1.4. What's wrong with .NET?

After analyzing the machine code produced by the JIT compiler, everything becomes clear. .NET has no register optimization!!! The loop variable x is accessed through a stack reference instead of being assigned to a register! Here are the opcodes produced for the loop:

    0000002f  mov   ecx,esi
    00000031  call  dword ptr ds:[00975A00h]       ; calling someMethod
    00000037  inc   dword ptr [ebp-10h]            ; x++
    0000003a  cmp   dword ptr [ebp-10h],3B9ACA00h  ; if x < 1,000,000,000
    00000041  jl    0000002F                       ; jump

Opcodes for someMethod:

    00000000  push  ebp
    00000001  mov   ebp,esp
    00000003  push  eax
    00000004  push  esi
    00000005  mov   esi,ecx
    00000007  mov   eax,dword ptr [esi+000000E4h]
    0000000d  add   dword ptr [esi+000000E4h],eax
    00000013  nop
    00000014  pop   esi
    00000015  mov   esp,ebp
    00000017  pop   ebp
    00000018  ret

I don't know why, but there is simply NO optimization of ANY KIND in this code. Maybe I need to switch it on somehow? If so, why is it not on by default?

Does anybody know?
Jon Skeet [C# MVP] - 11 Oct 2004 15:31 GMT
> Just installed .NET Framework 1.1 and VS.NET 2003. Wrote a simple program:
> [quoted text clipped - 19 lines]
> mode works 19 seconds. The same code in Delphi 5 works 3 seconds, 2
> seconds in Java 1.4. What's wrong with .NET?

Not sure, but the following program runs in just over 3 seconds on my laptop, as does the equivalent in Java.

    using System;

    public class Test
    {
        int a = 0;

        static void Main()
        {
            Test t = new Test();
            DateTime start = DateTime.Now;
            for (int x = 0; x < 1000000000; x++)
                t.SomeMethod();
            DateTime end = DateTime.Now;
            Console.WriteLine(end - start);
        }

        void SomeMethod() { a = a + a; }
    }

Then again, I got rather different results in the JIT compiled machine code, where the main part is:

    [0033]  xor  edx,edx
    [0035]  mov  eax,dword ptr [esi+4]
    [0038]  add  dword ptr [esi+4],eax
    [003b]  inc  edx
    [003c]  cmp  edx,3B9ACA00h
    [0042]  jl   FFFFFFF3
I got my disassembly using cordbg, with mode JitOptimizations=1, after building the code without debug information. How did you get yours?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet If replying to the group, please do not mail me too
Stefan Simek - 11 Oct 2004 15:47 GMT
It has something to do with WinForms... I tried the exact code John posted and got 20.7 seconds as a result. Writing the same application as a Console Application yielded 2.25 seconds.

Looking at the disassembly, the call to the function got inlined only in the Console Application:

    00000013  xor  edx,edx
    00000015  mov  eax,dword ptr [esi+4]
    00000018  add  dword ptr [esi+4],eax
    0000001b  inc  edx
    0000001c  cmp  edx,3B9ACA00h
    00000022  jl   00000015

In the Windows Application, the jitter used registers, but the call didn't get inlined:

    00000014  xor   edi,edi
    00000016  mov   ecx,esi
    00000018  call  dword ptr ds:[009859A0h]
    0000001e  inc   edi
    0000001f  cmp   edi,3B9ACA00h
    00000025  jl    00000016

someMethod:

    00000000  mov  eax,dword ptr [ecx+000000E4h]
    00000006  add  dword ptr [ecx+000000E4h],eax
    0000000c  ret
I really don't understand why this has happened, as the IL for someMethod is only 20 bytes and I thought the limit for inlining is supposed to be 32 bytes of IL...
Stefan
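[Editor's sketch] The two disassemblies above differ mainly in whether the call to someMethod was inlined. One way to isolate that effect, independent of WinForms, is to compare an ordinary small method with an identical one marked `MethodImplOptions.NoInlining`. The class and method names below are hypothetical, the field starts at 1 rather than 0 so the result is observable, and `Stopwatch` assumes .NET 2.0 or later (the thread itself used `GetTickCount` and `DateTime.Now` on 1.1). Whether the first method is actually inlined is up to the JIT, so timings will vary by runtime.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public class InlineDemo
{
    private int a = 1;

    // Small body: the JIT is free to inline this (it is not obliged to).
    private void Doubler() { a = a + a; }

    // Same body, but the attribute tells the JIT never to inline it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void DoublerNoInline() { a = a + a; }

    public static int RunInlinable(int iterations)
    {
        InlineDemo d = new InlineDemo();
        for (int x = 0; x < iterations; x++)
            d.Doubler();
        return d.a;
    }

    public static int RunNotInlined(int iterations)
    {
        InlineDemo d = new InlineDemo();
        for (int x = 0; x < iterations; x++)
            d.DoublerNoInline();
        return d.a;
    }

    public static void Main()
    {
        const int N = 1000000000;

        Stopwatch sw = Stopwatch.StartNew();
        RunInlinable(N);
        Console.WriteLine("inlinable:  " + sw.Elapsed);

        sw = Stopwatch.StartNew();
        RunNotInlined(N);
        Console.WriteLine("NoInlining: " + sw.Elapsed);
    }
}
```

Build with optimizations on and run outside the debugger (for example `csc /o+ /debug-`, as used elsewhere in this thread); otherwise both loops are likely to look equally slow.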
>> Just installed .NET Framework 1.1 and VS.NET 2003. Wrote a simple
>> program:
[quoted text clipped - 57 lines]
> I got my disassembly using cordbg, with mode JitOptimizations=1, after
> building the code without debug information. How did you get yours?

Jon Skeet [C# MVP] - 11 Oct 2004 16:03 GMT
> It has something to do with the WinForms... I've tried the exact code John
> posted, and got 20.7 seconds as a result. Writing the same application as
[quoted text clipped - 29 lines]
> only 20 bytes and I thought the limit for inlining is supposed to be 32
> bytes of IL...

Ah, I remember something now - I believe MarshalByRefObject makes a big difference. Changing my console app so that Test derives from MarshalByRefObject changes the timing to over 15s instead of 3s.
Someone gave me a reason for this at some point, but I wasn't entirely convinced it was necessary... I'll be interested to see if .NET 2.0 has the same "problem".
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet If replying to the group, please do not mail me too
Robert Jordan - 11 Oct 2004 15:35 GMT
> I dont know why, but there is simply NO ANY KIND of OPTIMIZATION available in this code.
> Maybe I need to switch it on somehow? If so, why it is not on by default?

The JIT compiler doesn't optimize while the application is being debugged.

Insert the statement System.Diagnostics.Debugger.Break() somewhere after the calculation, start the app (but not from within the IDE!!) and wait for the debugger to pop up.

You'll then get better performance than from the Delphi compiler.
bye Rob
John Demigor - 11 Oct 2004 15:59 GMT
Here is the code. Simply paste it, compile it in Release mode, close the IDE and run it directly. I couldn't get any register optimization on it.

I tried to disassemble it by pressing the Break All button in the VS.NET IDE. Can you reproduce it too? I still have .NET Framework 1.1.
    using System;
    using System.Drawing;
    using System.Collections;
    using System.ComponentModel;
    using System.Windows.Forms;
    using System.Data;
    using System.Runtime.InteropServices;

    namespace WindowsApplication1
    {
        /// <summary>
        /// Summary description for Form1.
        /// </summary>
        public class Form1 : System.Windows.Forms.Form
        {
            private System.Windows.Forms.Button button1;

            /// <summary>
            /// Required designer variable.
            /// </summary>
            private System.ComponentModel.Container components = null;

            public Form1()
            {
                //
                // Required for Windows Form Designer support
                //
                InitializeComponent();

                //
                // TODO: Add any constructor code after InitializeComponent call
                //
            }

            [DllImport("kernel32.dll")]
            public static extern int GetTickCount();

            /// <summary>
            /// Clean up any resources being used.
            /// </summary>
            protected override void Dispose( bool disposing )
            {
                if( disposing )
                {
                    if (components != null)
                    {
                        components.Dispose();
                    }
                }
                base.Dispose( disposing );
            }

            #region Windows Form Designer generated code
            /// <summary>
            /// Required method for Designer support - do not modify
            /// the contents of this method with the code editor.
            /// </summary>
            private void InitializeComponent()
            {
                this.button1 = new System.Windows.Forms.Button();
                this.SuspendLayout();
                //
                // button1
                //
                this.button1.Location = new System.Drawing.Point(104, 112);
                this.button1.Name = "button1";
                this.button1.TabIndex = 0;
                this.button1.Text = "Run Test";
                this.button1.Click += new System.EventHandler(this.button1_Click);
                //
                // Form1
                //
                this.AutoScaleBaseSize = new System.Drawing.Size(5, 13);
                this.ClientSize = new System.Drawing.Size(292, 272);
                this.Controls.Add(this.button1);
                this.Name = "Form1";
                this.Text = "Here come the time";
                this.ResumeLayout(false);
            }
            #endregion

            /// <summary>
            /// The main entry point for the application.
            /// </summary>
            [STAThread]
            static void Main()
            {
                Application.Run(new Form1());
            }

            private int a = 0;

            private void someMethod() { a = a + a; }

            private void button1_Click(object sender, System.EventArgs e)
            {
                int a = GetTickCount();
                for (int x = 0; x < 1000000000; x++)
                    someMethod();
                a = GetTickCount() - a;
                Text = a.ToString();
            }
        }
    }
Robert Jordan - 11 Oct 2004 16:21 GMT
> I tried to disassemble it by pressing Break All button in VS.NET IDE. Can
> you reproduce it too?
> I still have .NET framework 1.1.

As I told you: don't use the IDE if you want to see how the IL gets optimized.
bye Rob
John Demigor - 11 Oct 2004 16:31 GMT
> As I told you: don't use the IDE if you want to see how the IL
> gets optimized.

I use the IDE just to compile the exe file in Release mode.
Did you pay attention to my post? I wrote "compile it, close IDE, then run". It takes 19 seconds anyway. That's the point.
Robert Jordan - 11 Oct 2004 18:51 GMT
>>As I told you: don't use the IDE if you want to see how the IL
>>gets optimized.
[quoted text clipped - 3 lines]
> Did you pay attention to my post? I wrote "compile it, close IDE, then run".
> It takes 19 seconds anyway. That's the point.

Forms are MarshalByRefObjects. These classes cannot be optimized like normal classes, because they are supposed to be remoted using a proxy. The proxy has to intercept method calls, which obviously doesn't work the the method has be inlined.
bye Rob
Robert Jordan - 11 Oct 2004 18:54 GMT
> Forms are MarshalByRefObjects. These classes cannot
> be optimized like normal classes, because they are
> supposed to be remoted using a proxy. The proxy
> has to intercept method calls, which obviously
> doesn't work the the method has be inlined.

Correction: which obviously doesn't work when the method has been inlined.
bye Rob
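[Editor's sketch] The effect described above can be measured without WinForms by timing the same tiny method on an ordinary class and on a class deriving from MarshalByRefObject, in the spirit of the console tests reported elsewhere in this thread. The class names are hypothetical, the field starts at 1 so the result can be checked, and `Stopwatch` and `static class` assume .NET 2.0 or later. The large gap was reported on .NET Framework 1.1; other runtime versions may behave differently, so no timings are claimed here.

```csharp
using System;
using System.Diagnostics;

// Ordinary class: SomeMethod is a candidate for JIT inlining.
class PlainCounter
{
    private int a = 1;
    public int Value { get { return a; } }
    public void SomeMethod() { a = a + a; }
}

// Same members, but deriving from MarshalByRefObject, which the thread
// reports as preventing inlining because calls may go through a proxy.
class MbrCounter : MarshalByRefObject
{
    private int a = 1;
    public int Value { get { return a; } }
    public void SomeMethod() { a = a + a; }
}

static class MbrBench
{
    public static int RunPlain(int iterations)
    {
        PlainCounter c = new PlainCounter();
        for (int x = 0; x < iterations; x++)
            c.SomeMethod();
        return c.Value;
    }

    public static int RunMbr(int iterations)
    {
        MbrCounter c = new MbrCounter();
        for (int x = 0; x < iterations; x++)
            c.SomeMethod();
        return c.Value;
    }

    static void Main()
    {
        const int N = 1000000000;

        Stopwatch sw = Stopwatch.StartNew();
        RunPlain(N);
        Console.WriteLine("plain class:        " + sw.Elapsed);

        sw = Stopwatch.StartNew();
        RunMbr(N);
        Console.WriteLine("MarshalByRefObject: " + sw.Elapsed);
    }
}
```

Both loops compute the same value, so any difference in elapsed time comes from how the runtime compiles the call, not from the work done.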
John Demigor - 11 Oct 2004 15:52 GMT
Here is the code. Simply paste it and run it in Release mode. I couldn't get any register optimization on it. Disassembled by pressing the Break All button in the VS.NET IDE. (Yes, I ran it in the IDE, but without the IDE it takes the same time to complete anyway.)

Can you reproduce it too? I have .NET Framework 1.1.
[code clipped - identical to the Form1 listing in the 15:59 GMT post above]
Robert Jordan - 11 Oct 2004 18:38 GMT
Hi,

I just tested John's sample and encountered the following problem: members of MarshalByRefObjects are not optimized by the JIT like those of other objects. In the following sample, "someMethod()" will not be inlined when the class is a MarshalByRefObject.
Is this a documented "feature"??
bye Rob
    //
    // compile: csc /o+ /debug- bench.cs
    //
    using System;
    using System.Diagnostics;

    // use this to get the optimized version:
    // class App {
    class App : MarshalByRefObject {
        int a = 0;

        private void someMethod() { a = a + a; }

        static void Main()
        {
            App app = new App();
            DateTime t = DateTime.Now;
            for (int x = 0; x < 1000000000; x++)
                app.someMethod();
            Console.WriteLine(DateTime.Now - t);
        }
    }
Robert Jordan - 11 Oct 2004 18:47 GMT
> Hi,
>
[quoted text clipped - 4 lines]
>
> Is this a documented "feature"??

I found a blog post about MBRs and optimizations:
http://blogs.msdn.com/cbrumme/archive/2003/07/14/51495.aspx
bye Rob
Stuart Carnie - 21 Oct 2004 01:27 GMT
This has other interesting ramifications...

As all controls are derived from Control, they all inherit from MarshalByRefObject (MBR) by way of Component. Given that, no inlining of properties etc. is happening.

I have just done some tests to verify this.

e.g.

    class MyForm : Form
    {
        // _lineHeight and _currentPos are int fields of MyForm (declarations omitted)
        Int32 AbsScreenToScreen(Int32 lineNo) { return lineNo / _lineHeight; }

        Int32 ScreenColumn
        {
            get { return _currentPos * _lineHeight; }
            set { _currentPos = value; }
        }
    }

Neither of these members is inlined.

However, if I do this:

    class MyForm : Form
    {
        private class myClass
        {
            // here _lineHeight and _currentPos are fields of myClass itself (declarations omitted)
            public Int32 AbsScreenToScreen(Int32 lineNo) { return lineNo / _lineHeight; }

            public Int32 ScreenColumn
            {
                get { return _currentPos * _lineHeight; }
                set { _currentPos = value; }
            }
        }
    }

They now get inlined.
Interesting...
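[Editor's sketch] A self-contained variant of the nested-class idea, assuming the helper owns its own state (a nested class in C# cannot reach the outer instance's fields without a reference to it). `EditorSurface`, `Metrics`, `SumRows` and the line height of 16 are hypothetical names and values chosen for illustration; `EditorSurface` derives directly from MarshalByRefObject as a stand-in for a Form so the example runs without WinForms. Whether the helper's members actually get inlined depends on the runtime's JIT.

```csharp
using System;

// Stand-in for a Form: any class deriving from MarshalByRefObject.
class EditorSurface : MarshalByRefObject
{
    // Plain nested class: not an MBR, so its small members
    // remain candidates for JIT inlining.
    private sealed class Metrics
    {
        private int lineHeight;
        private int currentPos;

        public Metrics(int lineHeight) { this.lineHeight = lineHeight; }

        public int AbsScreenToScreen(int lineNo) { return lineNo / lineHeight; }

        public int ScreenColumn
        {
            get { return currentPos * lineHeight; }
            set { currentPos = value; }
        }
    }

    private readonly Metrics metrics = new Metrics(16);

    // The hot loop only calls members of the plain helper.
    public int SumRows(int count)
    {
        Metrics m = metrics;
        int total = 0;
        for (int i = 0; i < count; i++)
            total += m.AbsScreenToScreen(i);
        return total;
    }
}

class Program
{
    static void Main()
    {
        EditorSurface surface = new EditorSurface();
        Console.WriteLine(surface.SumRows(64)); // prints 96
    }
}
```

The design choice is simply to keep per-iteration work off the MBR-derived type: the form keeps one reference to the helper, and the loop touches only the helper.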
> > Hi,
> >
[quoted text clipped - 11 lines]
> bye
> Rob

Stuart Carnie - 21 Oct 2004 01:39 GMT
...and another observation: properties that in turn return the result of another property were only partially inlined:

    struct Rectangle
    {
        private Int32 x;

        public Int32 X
        {
            get { return this.x; }    // the x member variable
        }

        public Int32 Left
        {
            get { return this.X; }    // the X property
        }
    }

When accessing Left, the JIT'd code was:

    // Int32 x = myRect.X;
    mov   ebx,dword ptr [esp+0Ch]      // loads value of myRect.x

    // Int32 Left = myRect.Left
    call  77838904
    (and then in the routine)
    00000000  mov  eax,dword ptr [ecx]  // loads the value of myRect.x
    00000002  ret
> This has other interesting ramifications...
>
[quoted text clipped - 61 lines]
> > bye
> > Rob

Robert Jordan - 21 Oct 2004 08:45 GMT
> This has other interesting ramifications...
>
> As all controls are derived from Control, they all directly inherit from MBR
> objects. Given that, no inlining of properties, etc are happening..

I knew that Form (the place where the OP observed the non-inlining) is an MBR. That's why I changed the subject of the thread.

It wasn't smart, because not everybody knows (or needs to know) that Forms are MBRs. Sorry for that.
bye Rob
Stuart Carnie - 21 Oct 2004 18:12 GMT
I knew that too :) I was just taking it further, with some experimentation and results of my own.
Normally, I wouldn't expect heavy processing to take place in Form methods, but we should certainly be aware of it..
Cheers,
Stu
> > This has other interesting ramifications...
> >
[quoted text clipped - 9 lines]
> bye
> Rob

Stuart Carnie - 19 Oct 2004 22:24 GMT
So, based on that, use a nested class:

    [DllImport("kernel32.dll")]
    public static extern int GetTickCount();

    private class myClass
    {
        static int a = 0;

        public static void someMethod() { a = a + a; }
    }

    private void button1_Click(object sender, System.EventArgs e)
    {
        int a = GetTickCount();
        for (int x = 0; x < 1000000000; x++)
            myClass.someMethod();
        a = GetTickCount() - a;
        Text = a.ToString();
    }
Runs in 1.9 seconds on my machine - previously 15.5 seconds..
Cheers,
Stu