.NET Forum / .NET Framework / CLR / March 2008
is there some instruction level overhead to dynamicMethod?
|
|
Thread rating:  |
psi - 11 Mar 2008 13:47 GMT i've tried to get a performance improvement by using dynamicMethod, but couldn't understand why it was so slow.
so, as a test, i copied the il from a decompiled method, then made a dynamicMethod with the same il in the same program and found the dynamicMethod about 10x slower, obviously no hope for improving speed, but can't see why this would be so.
BTW this is on the il instruction level, because when i made test programs with very large chunks of il, up to 10000 instructions, the speed ratio stayed the same.
Jon Skeet [C# MVP] - 11 Mar 2008 13:53 GMT > i've tried to get a performance improvement by using dynamicMethod, > but couldn't understand why it was so slow. [quoted text clipped - 7 lines] > programs with very large chunks of il, up to 10000 instructions, the > speed ratio stayed the same. What exactly are you benchmarking - just the *execution* of the dynamic method, or including building it and first JIT?
Do you have a benchmark program we could check out?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
psi - 11 Mar 2008 14:23 GMT > What exactly are you benchmarking - just the *execution* of the dynamic > method, or including building it and first JIT? just the invoke, and not the first one.
> Do you have a benchmark program we could check out? this is the output;
Done 10000000 in 5.5379632 secs - 1805718.03005119 ips. Done 10000000 in 0.7911376 secs - 12640026.2103583 ips.
from this, compiled to a console executable:
// Microsoft (R) .NET Framework IL Disassembler. Version 2.0.50727.42 // Copyright (c) Microsoft Corporation. All rights reserved.
// Metadata version: v2.0.50727 .assembly extern mscorlib { .publickeytoken = (B7 7A 5C 56 19 34 E0 89 ) // .z\V.4.. .ver 2:0:0:0 } .assembly test13 {
// --- The following custom attribute is added automatically, do not uncomment ------- // .custom instance void [mscorlib]System.Diagnostics.DebuggableAttribute::.ctor(bool, // bool) = ( 01 00 01 01 00 00 )
.custom instance void [mscorlib]System.Runtime.CompilerServices.RuntimeCompatibilityAttribute::.ctor() = ( 01 00 01 00 54 02 16 57 72 61 70 4E 6F 6E 45 78 // ....T..WrapNonEx 63 65 70 74 69 6F 6E 54 68 72 6F 77 73 01 ) // ceptionThrows. .hash algorithm 0x00008004 .ver 0:0:0:0 } .module test13 // MVID: {20EE0608-B0F9-445F-92CB-50A5CE6C9AD6} .imagebase 0x00400000 .file alignment 0x00000200 .stackreserve 0x00100000 .subsystem 0x0003 // WINDOWS_CUI .corflags 0x00000001 // ILONLY // Image base: 0x010D0000
// =============== CLASS MEMBERS DECLARATION ===================
.class public auto ansi serializable beforefieldinit ilplay.dynamicMethodOwner extends [mscorlib]System.Object { .method public hidebysig specialname rtspecialname instance void .ctor() cil managed { // Code size 7 (0x7) .maxstack 2 IL_0000: ldarg.0 IL_0001: call instance void [mscorlib]System.Object::.ctor() IL_0006: ret } // end of method dynamicMethodOwner::.ctor
} // end of class ilplay.dynamicMethodOwner
.class public auto ansi sealed beforefieldinit ilplay.Test13Module extends [mscorlib]System.Object { .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGlobalScopeAttribute::.ctor() = ( 01 00 00 00 ) .method private hidebysig static void Main(string[] argv) cil managed { .entrypoint // Code size 999 (0x3e7) .maxstack 23 .locals init ([0] int32 count, [1] class [mscorlib]System.Reflection.Emit.DynamicMethod meb, [2] class [mscorlib]System.Reflection.Emit.ILGenerator 'il', [3] int32 i, [4] int32[] a, [5] int32 count2, [6] valuetype [mscorlib]System.DateTime moment1, [7] int32 j, [8] valuetype [mscorlib]System.DateTime moment2, [9] valuetype [mscorlib]System.TimeSpan elapsed, [10] int32 temp1, [11] int32 temp2, [12] int32 temp3, [13] int32 temp4, [14] int32 temp5, [15] int32 temp0, [16] int32 V_16, [17] int32 ___temp1, [18] int32 ___temp2, [19] int32 ___temp3, [20] int32 ___temp4, [21] int32 ___temp5, [22] int32 ___temp6) IL_0000: ldc.i4 0x3e8 IL_0005: stloc.0 IL_0006: ldstr "test" IL_000b: ldnull IL_000c: ldc.i4 0x1 IL_0011: newarr [mscorlib]System.Type IL_0016: dup IL_0017: ldc.i4 0x0 IL_001c: ldtoken [mscorlib]System.Array IL_0021: call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle) IL_0026: stelem.ref IL_0027: ldtoken ilplay.dynamicMethodOwner IL_002c: call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle) IL_0031: ldc.i4.1 IL_0032: newobj instance void [mscorlib]System.Reflection.Emit.DynamicMethod::.ctor(string, class [mscorlib]System.Type, class [mscorlib]System.Type[], class [mscorlib]System.Type, bool) IL_0037: stloc.1 IL_0038: ldloc.1 IL_0039: ldloc.0 IL_003a: ldc.i4 0x1e IL_003f: mul.ovf IL_0040: callvirt instance class [mscorlib]System.Reflection.Emit.ILGenerator [mscorlib]System.Reflection.Emit.DynamicMethod::GetILGenerator(int32) IL_0045: stloc.2 IL_0046: ldc.i4.0 IL_0047: stloc.s ___temp1 IL_0049: ldloc.0 IL_004a: stloc.s ___temp2 IL_004c: ldloc.s ___temp2 IL_004e: ldc.i4.0 IL_004f: clt IL_0051: brfalse IL_0061
IL_0056: ldstr "max" IL_005b: newobj instance void [mscorlib]System.ArgumentOutOfRangeException::.ctor(string) IL_0060: throw
IL_0061: br IL_01a3
IL_0066: ldloc.s ___temp1 IL_0068: stloc.3 IL_0069: ldloc.s ___temp1 IL_006b: ldc.i4.1 IL_006c: add IL_006d: stloc.s ___temp1 IL_006f: ldloc.2 IL_0070: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldarg_0 IL_0075: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_007a: ldloc.2 IL_007b: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldc_I4_0 IL_0080: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0085: ldloc.2 IL_0086: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldarg_0 IL_008b: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0090: ldloc.2 IL_0091: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldc_I4_0 IL_0096: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_009b: ldloc.2 IL_009c: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldelem_I4 IL_00a1: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00a6: ldloc.2 IL_00a7: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldarg_0 IL_00ac: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00b1: ldloc.2 IL_00b2: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldc_I4_1 IL_00b7: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00bc: ldloc.2 IL_00bd: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldelem_I4 IL_00c2: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00c7: ldloc.2 IL_00c8: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldarg_0 IL_00cd: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00d2: ldloc.2 IL_00d3: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldc_I4_2 IL_00d8: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00dd: ldloc.2 IL_00de: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldelem_I4 IL_00e3: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00e8: ldloc.2 IL_00e9: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldarg_0 IL_00ee: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00f3: ldloc.2 IL_00f4: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldc_I4_3 IL_00f9: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_00fe: ldloc.2 IL_00ff: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldelem_I4 IL_0104: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0109: ldloc.2 IL_010a: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldarg_0 IL_010f: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0114: ldloc.2 IL_0115: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldc_I4_4 IL_011a: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_011f: ldloc.2 IL_0120: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldelem_I4 IL_0125: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_012a: ldloc.2 IL_012b: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldarg_0 IL_0130: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0135: ldloc.2 IL_0136: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldc_I4_5 IL_013b: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0140: ldloc.2 IL_0141: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldelem_I4 IL_0146: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_014b: ldloc.2 IL_014c: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Add_Ovf_Un IL_0151: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0156: ldloc.2 IL_0157: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Add_Ovf_Un IL_015c: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0161: ldloc.2 IL_0162: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Add_Ovf_Un IL_0167: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_016c: ldloc.2 IL_016d: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Add_Ovf_Un IL_0172: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0177: ldloc.2 IL_0178: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Add_Ovf_Un IL_017d: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0182: ldloc.2 IL_0183: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ldc_I4_6 IL_0188: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_018d: ldloc.2 IL_018e: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Div_Un IL_0193: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_0198: ldloc.2 IL_0199: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Stelem_I4 IL_019e: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_01a3: ldloc.s ___temp1 IL_01a5: ldloc.s ___temp2 IL_01a7: blt IL_0066
IL_01ac: ldloc.2 IL_01ad: ldsfld valuetype [mscorlib]System.Reflection.Emit.OpCode [mscorlib]System.Reflection.Emit.OpCodes::Ret IL_01b2: callvirt instance void [mscorlib]System.Reflection.Emit.ILGenerator::Emit(valuetype [mscorlib]System.Reflection.Emit.OpCode) IL_01b7: ldc.i4 0x6 IL_01bc: newarr [mscorlib]System.Int32 IL_01c1: dup IL_01c2: ldc.i4 0x0 IL_01c7: ldc.i4.1 IL_01c8: stelem.i4 IL_01c9: dup IL_01ca: ldc.i4 0x1 IL_01cf: ldc.i4.2 IL_01d0: stelem.i4 IL_01d1: dup IL_01d2: ldc.i4 0x2 IL_01d7: ldc.i4.3 IL_01d8: stelem.i4 IL_01d9: dup IL_01da: ldc.i4 0x3 IL_01df: ldc.i4.4 IL_01e0: stelem.i4 IL_01e1: dup IL_01e2: ldc.i4 0x4 IL_01e7: ldc.i4.5 IL_01e8: stelem.i4 IL_01e9: dup IL_01ea: ldc.i4 0x5 IL_01ef: ldc.i4.6 IL_01f0: stelem.i4 IL_01f1: stloc.s a IL_01f3: ldc.i4 0x2710 IL_01f8: stloc.s count2 IL_01fa: call valuetype [mscorlib]System.DateTime [mscorlib]System.DateTime::get_Now() IL_01ff: stloc.s moment1 IL_0201: ldloc.1 IL_0202: ldnull IL_0203: ldc.i4 0x1 IL_0208: newarr [mscorlib]System.Object IL_020d: dup IL_020e: ldc.i4 0x0 IL_0213: ldloc.s a IL_0215: stelem.ref IL_0216: callvirt instance object [mscorlib]System.Reflection.MethodBase::Invoke(object, object[]) IL_021b: pop IL_021c: call valuetype [mscorlib]System.DateTime [mscorlib]System.DateTime::get_Now() IL_0221: stloc.s moment1 IL_0223: ldc.i4.0 IL_0224: stloc.s ___temp3 IL_0226: ldloc.s count2 IL_0228: stloc.s ___temp4 IL_022a: ldloc.s ___temp4 IL_022c: ldc.i4.0 IL_022d: clt IL_022f: brfalse IL_023f
IL_0234: ldstr "max" IL_0239: newobj instance void [mscorlib]System.ArgumentOutOfRangeException::.ctor(string) IL_023e: throw
IL_023f: br IL_0269
IL_0244: ldloc.s ___temp3 IL_0246: stloc.s j IL_0248: ldloc.s ___temp3 IL_024a: ldc.i4.1 IL_024b: add IL_024c: stloc.s ___temp3 IL_024e: ldloc.1 IL_024f: ldnull IL_0250: ldc.i4 0x1 IL_0255: newarr [mscorlib]System.Object IL_025a: dup IL_025b: ldc.i4 0x0 IL_0260: ldloc.s a IL_0262: stelem.ref IL_0263: callvirt instance object [mscorlib]System.Reflection.MethodBase::Invoke(object, object[]) IL_0268: pop IL_0269: ldloc.s ___temp3 IL_026b: ldloc.s ___temp4 IL_026d: blt IL_0244
IL_0272: call valuetype [mscorlib]System.DateTime [mscorlib]System.DateTime::get_Now() IL_0277: stloc.s moment2 IL_0279: ldloc.s moment2 IL_027b: ldloc.s moment1 IL_027d: call valuetype [mscorlib]System.TimeSpan [mscorlib]System.DateTime::op_Subtraction(valuetype [mscorlib]System.DateTime, valuetype [mscorlib]System.DateTime) IL_0282: stloc.s elapsed IL_0284: ldstr "Done " IL_0289: newobj instance void [mscorlib]System.Text.StringBuilder::.ctor(string) IL_028e: ldloc.0 IL_028f: ldloc.s count2 IL_0291: mul.ovf IL_0292: box [mscorlib]System.Int32 IL_0297: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(object) IL_029c: ldstr " in " IL_02a1: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string) IL_02a6: ldloca.s elapsed IL_02a8: call instance float64 [mscorlib]System.TimeSpan::get_TotalSeconds() IL_02ad: box [mscorlib]System.Double IL_02b2: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(object) IL_02b7: ldstr " secs - " IL_02bc: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string) IL_02c1: ldloc.0 IL_02c2: ldloc.s count2 IL_02c4: mul.ovf IL_02c5: conv.r8 IL_02c6: ldloca.s elapsed IL_02c8: call instance float64 [mscorlib]System.TimeSpan::get_TotalSeconds() IL_02cd: div IL_02ce: box [mscorlib]System.Double IL_02d3: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(object) IL_02d8: ldstr " ips." IL_02dd: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string) IL_02e2: call instance string [mscorlib]System.Text.StringBuilder::ToString() IL_02e7: call void [mscorlib]System.Console::WriteLine(string) IL_02ec: ldc.i4 0x2710 IL_02f1: stloc.s count2 IL_02f3: call valuetype [mscorlib]System.DateTime [mscorlib]System.DateTime::get_Now() IL_02f8: stloc.s moment1 IL_02fa: ldc.i4.1 IL_02fb: stloc.s temp1 IL_02fd: ldc.i4.2 IL_02fe: stloc.s temp2 IL_0300: ldc.i4.3 IL_0301: stloc.s temp3 IL_0303: ldc.i4.4 IL_0304: stloc.s temp4 IL_0306: ldc.i4.5 IL_0307: stloc.s temp5 IL_0309: ldc.i4.0 IL_030a: stloc.s temp0 IL_030c: ldc.i4.0 IL_030d: stloc.s ___temp5 IL_030f: ldloc.0 IL_0310: ldloc.s count2 IL_0312: mul.ovf IL_0313: stloc.s ___temp6 IL_0315: ldloc.s ___temp6 IL_0317: ldc.i4.0 IL_0318: clt IL_031a: brfalse IL_032a
IL_031f: ldstr "max" IL_0324: newobj instance void [mscorlib]System.ArgumentOutOfRangeException::.ctor(string) IL_0329: throw
IL_032a: br IL_0363
IL_032f: ldloc.s ___temp5 IL_0331: stloc.s V_16 IL_0333: ldloc.s ___temp5 IL_0335: ldc.i4.1 IL_0336: add IL_0337: stloc.s ___temp5 IL_0339: ldloc.s a IL_033b: ldloc.s temp0 IL_033d: ldloc.s a IL_033f: ldloc.s temp0 IL_0341: ldelem.i4 IL_0342: ldloc.s a IL_0344: ldloc.s temp1 IL_0346: ldelem.i4 IL_0347: add.ovf IL_0348: ldloc.s a IL_034a: ldloc.s temp2 IL_034c: ldelem.i4 IL_034d: add.ovf IL_034e: ldloc.s a IL_0350: ldloc.s temp3 IL_0352: ldelem.i4 IL_0353: add.ovf IL_0354: ldloc.s a IL_0356: ldloc.s temp4 IL_0358: ldelem.i4 IL_0359: add.ovf IL_035a: ldloc.s a IL_035c: ldloc.s temp5 IL_035e: ldelem.i4 IL_035f: add.ovf IL_0360: ldc.i4.6 IL_0361: div IL_0362: stelem.i4 IL_0363: ldloc.s ___temp5 IL_0365: ldloc.s ___temp6 IL_0367: blt IL_032f
IL_036c: call valuetype [mscorlib]System.DateTime [mscorlib]System.DateTime::get_Now() IL_0371: stloc.s moment2 IL_0373: ldloc.s moment2 IL_0375: ldloc.s moment1 IL_0377: call valuetype [mscorlib]System.TimeSpan [mscorlib]System.DateTime::op_Subtraction(valuetype [mscorlib]System.DateTime, valuetype [mscorlib]System.DateTime) IL_037c: stloc.s elapsed IL_037e: ldstr "Done " IL_0383: newobj instance void [mscorlib]System.Text.StringBuilder::.ctor(string) IL_0388: ldloc.0 IL_0389: ldloc.s count2 IL_038b: mul.ovf IL_038c: box [mscorlib]System.Int32 IL_0391: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(object) IL_0396: ldstr " in " IL_039b: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string) IL_03a0: ldloca.s elapsed IL_03a2: call instance float64 [mscorlib]System.TimeSpan::get_TotalSeconds() IL_03a7: box [mscorlib]System.Double IL_03ac: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(object) IL_03b1: ldstr " secs - " IL_03b6: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string) IL_03bb: ldloc.0 IL_03bc: ldloc.s count2 IL_03be: mul.ovf IL_03bf: conv.r8 IL_03c0: ldloca.s elapsed IL_03c2: call instance float64 [mscorlib]System.TimeSpan::get_TotalSeconds() IL_03c7: div IL_03c8: box [mscorlib]System.Double IL_03cd: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(object) IL_03d2: ldstr " ips." IL_03d7: call instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string) IL_03dc: call instance string [mscorlib]System.Text.StringBuilder::ToString() IL_03e1: call void [mscorlib]System.Console::WriteLine(string) IL_03e6: ret } // end of method Test13Module::Main
.method private hidebysig specialname rtspecialname instance void .ctor() cil managed { // Code size 7 (0x7) .maxstack 2 IL_0000: ldarg.0 IL_0001: call instance void [mscorlib]System.Object::.ctor() IL_0006: ret } // end of method Test13Module::.ctor
} // end of class ilplay.Test13Module
// =============================================================
// *********** DISASSEMBLY COMPLETE ***********************
Jon Skeet [C# MVP] - 11 Mar 2008 14:40 GMT > > What exactly are you benchmarking - just the *execution* of the dynamic > > method, or including building it and first JIT? [quoted text clipped - 9 lines] > > from this, compiled to a console executable: Do you have it in a C# form which is slightly easier to play around with?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
psi - 11 Mar 2008 14:55 GMT > Do you have it in a C# form which is slightly easier to play around > with? sorry no c, but here's a Boo version, that's basically the same,
namespace ilplay class dynamicMethodOwner(): pass
count=1000 meb=System.Reflection.Emit.DynamicMethod("test",null, (typeof(System.Array),),ilplay.dynamicMethodOwner,true) il=meb.GetILGenerator(count*30)
for i in range(count):
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_0)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_0) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_1) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_2) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_3) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_4) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_5) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un)
il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_6) il.Emit(System.Reflection.Emit.OpCodes.Div_Un) il.Emit(System.Reflection.Emit.OpCodes.Stelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ret)
a=array(System.Int32,(1,2,3,4,5,6))
count2=10000 moment1=date.Now meb.Invoke(null,array(System.Object,(a,))) moment1=date.Now for j in range(count2): meb.Invoke(null,array(System.Object,(a,))) moment2=date.Now
elapsed = (moment2-moment1)
print("Done ${count*count2} in ${elapsed.TotalSeconds} secs - $ {count*count2/elapsed.TotalSeconds} ips.")
count2=10000 moment1=date.Now temp1=1 temp2=2 temp3=3 temp4=4 temp5=5 temp0=0 for j in range(count*count2): rawArrayIndexing: a[temp0]=(a[temp0]+a[temp1]+a[temp2]+a[temp3]+a[temp4]+a[temp5])/6
moment2=date.Now
elapsed = (moment2-moment1)
print("Done ${count*count2} in ${elapsed.TotalSeconds} secs - $ {count*count2/elapsed.TotalSeconds} ips.")
Jon Skeet [C# MVP] - 11 Mar 2008 15:06 GMT > > Do you have it in a C# form which is slightly easier to play around > > with? > > sorry no c, but here's a Boo version, that's basically the same, <snip>
Okay, I think I see the problem.
I suspect that your code is being JITted on every iteration, because you're using Invoke.
Call CreateDelegate once instead, and then repeatedly call the delegate.
I don't guarantee that that'll do it - I'm far from an expert on this area - but it's definitely worth a try.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
Barry Kelly - 13 Mar 2008 14:25 GMT > count=1000 > meb=System.Reflection.Emit.DynamicMethod("test",null, > (typeof(System.Array),),ilplay.dynamicMethodOwner,true) You should consider using a more specific type than this, such as typeof(int[]) (I don't know the Boo syntax).
> il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) > il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) > il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) > il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) > il.Emit(System.Reflection.Emit.OpCodes.Add_Ovf_Un) Do be aware that you are generating 5000 overflow checks here. C#, for example, does *not* default to using overflow checks - you have to use 'checked' keyword; for that reason, it is less likely to be highly optimized than non-overflow-checked add. OpCodes.Add would do for an unsigned add without overflow check.
> meb.Invoke(null,array(System.Object,(a,))) Like Jon says, to avoid reflection overhead, you need to create a delegate and call through that.
I was curious enough to write up a C# translation, and the times for the DynamicMethod are roughly twice those for the C# version. I don't really have time to figure out why right now, though...
-- Barry
 Signature http://barrkel.blogspot.com/
psi - 17 Mar 2008 01:14 GMT thanks for the replies, i've now done some more work on this,
> You should consider using a more specific type than this, such as > typeof(int[]) (I don't know the Boo syntax). > Like Jon says, to avoid reflection overhead, you need to create a > delegate and call through that. see below, using delegate and not overflow checking.
removing overflow checking resulted in it now only being 5x slower, delegate made no difference, don't forget the large number of op-codes make setup overhead etc. pretty insignificant
> I was curious enough to write up a C# translation, and the times for the > DynamicMethod are roughly twice those for the C# version. I don't really > have time to figure out why right now, though... did you replace the op-codes with those generated by c# compiler?
the Boo compiler op-codes seem pretty optimal, very hard to see where a 5x drop in speed comes from, so also hard to see how c# is only 2x faster, could the c# compiler be less efficient than the Boo compiler?.
BTW what i'm attempting is to use the dynamicmethod to remove the looping overhead by inlining the whole thing, i have a use where the loop count only rarely changes, so basically this is another level of JIT.
psi
Boo program output;
Done 10000000 in 3.7353712 secs - 2677110.10889627 ips. Done 10000000 in 0.7811232 secs - 12802077.8284399 ips.
program;
namespace ilplay class dynamicMethodOwner(): pass
callable nullArray(p1 as (int))
count=1000 meb=System.Reflection.Emit.DynamicMethod("",null, (typeof((int)),),ilplay.dynamicMethodOwner,true) il=meb.GetILGenerator(count*30)
for i in range(count):
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_0)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_0) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_1) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_2) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_3) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_4) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0) il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_5) il.Emit(System.Reflection.Emit.OpCodes.Ldelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Add) il.Emit(System.Reflection.Emit.OpCodes.Add) il.Emit(System.Reflection.Emit.OpCodes.Add) il.Emit(System.Reflection.Emit.OpCodes.Add) il.Emit(System.Reflection.Emit.OpCodes.Add)
il.Emit(System.Reflection.Emit.OpCodes.Ldc_I4_6) il.Emit(System.Reflection.Emit.OpCodes.Div_Un) il.Emit(System.Reflection.Emit.OpCodes.Stelem_I4)
il.Emit(System.Reflection.Emit.OpCodes.Ret)
a=array(System.Int32,(1,2,3,4,5,6))
count2=10000 moment1=date.Now
myd = meb.CreateDelegate(typeof(nullArray))
myd.DynamicInvoke(a) moment1=date.Now for j in range(count2): myd.DynamicInvoke(a) moment2=date.Now
elapsed = (moment2-moment1)
print("Done ${count*count2} in ${elapsed.TotalSeconds} secs - $ {count*count2/elapsed.TotalSeconds} ips.")
count2=10000 moment1=date.Now temp1=1 temp2=2 temp3=3 temp4=4 temp5=5 temp0=0 for j in range(count*count2): rawArrayIndexing: a[temp0]=(a[temp0]+a[temp1]+a[temp2]+a[temp3]+a[temp4]+a[temp5])/6 moment2=date.Now
elapsed = (moment2-moment1)
print("Done ${count*count2} in ${elapsed.TotalSeconds} secs - $ {count*count2/elapsed.TotalSeconds} ips.")
Jon Skeet [C# MVP] - 17 Mar 2008 01:42 GMT > > You should consider using a more specific type than this, such as > > typeof(int[]) (I don't know the Boo syntax). > > Like Jon says, to avoid reflection overhead, you need to create a > > delegate and call through that. > > see below, using delegate and not overflow checking. I see you're calling DynamicInvoke though - why not cast to the appropriate delegate type and call it more directly?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
Barry Kelly - 17 Mar 2008 09:18 GMT > myd = meb.CreateDelegate(typeof(nullArray)) > > myd.DynamicInvoke(a) You should be calling the delegate directly, not through reflection (DynamicInvoke is a reflection-based invocation), roughly like in the following C#:
---8<--- using System; using System.Reflection.Emit; using System.Diagnostics;
namespace EmitBench { class Program { const int count = 10000; const int runIters = 1000; const int csRunIters = count * runIters; static void Main() { Stopwatch w = new Stopwatch(); double elapsed; DynamicMethod m = new DynamicMethod("test", typeof(void), new[] { typeof(int[]) }); ILGenerator g = m.GetILGenerator(); for (int i = 0; i < count; ++i) { g.Emit(OpCodes.Ldarg_0); g.Emit(OpCodes.Ldc_I4_0); for (int j = 0; j < 6; ++j) { g.Emit(OpCodes.Ldarg_0); g.Emit(OpCodes.Ldc_I4, j); g.Emit(OpCodes.Ldelem_I4); } for (int j = 0; j < 5; ++j) g.Emit(OpCodes.Add); g.Emit(OpCodes.Ldc_I4_6); g.Emit(OpCodes.Div); g.Emit(OpCodes.Stelem_I4); } g.Emit(OpCodes.Ret); var args = new[] { 1, 2, 3, 4, 5, 6 }; // Exclude first-time JIT Action<int[]> f = (Action<int[]>) m.CreateDelegate(typeof(Action<int[]>)); f(args); f(args); w.Reset(); w.Start(); for (int i = 0; i < runIters; ++i) f(args); elapsed = w.ElapsedTicks / (double) Stopwatch.Frequency; Console.WriteLine("DM took {0:F3} seconds ({1}).", elapsed, args[0]); // Compare using C# checked // necessary to get Add_Ovf_Un { // Reset values uint[] uargs = new uint[] { 1, 2, 3, 4, 5, 6 }; w.Reset(); w.Start(); for (int i = 0; i < csRunIters; ++i) uargs[0] = (uargs[0] + uargs[1] + uargs[2] + uargs[3] + uargs[4] + uargs[5]) / 6; elapsed = w.ElapsedTicks / (double) Stopwatch.Frequency; Console.WriteLine("C# took {0:F3} seconds ({1}).", elapsed, uargs[0]); } } } } --->8---
-- Barry
 Signature http://barrkel.blogspot.com/
psi - 17 Mar 2008 22:04 GMT > > myd = meb.CreateDelegate(typeof(nullArray)) > [quoted text clipped - 3 lines] > (DynamicInvoke is a reflection-based invocation), roughly like in the > following C#: OK.
myd as nullArray = meb.CreateDelegate(typeof(nullArray))
myd(a)
Done 10000000 in 2.9141904 secs - 3431484.77875708 ips. Done 10000000 in 0.7811232 secs - 12802077.8284399 ips.
so we're down to 4x slower, the problem is that if dynamicmethod runs op-codes this slowly, isn't it mostly useless?
i've also noticed that the loop time drops, as expected, as the loop count increases, divides the method level overhead, but only up too about 200 cycles, where its only twice as slow, it then increases to the 4x at 1000 cycles, so something is happening that increases the individual op-code overhead with longer op-code runs?
psi - 17 Mar 2008 22:06 GMT > removing overflow checking resulted in it now only being 5x slower, > delegate made no difference, don't forget the large number of op-codes > make setup overhead etc. pretty insignificant BTW, on mono the conversion to non-overflow checked makes much less difference than it does on ms .net.
psi - 18 Mar 2008 18:03 GMT discovered one more thing;
with the count reduced to 25, im now getting a speed UP, about 15% on ms.net and 30% on mono, which was what i was originally expecting, so roughly speaking the problem starts above about 1k op-codes, and is over linear on the op-code count.
which doesn't help me but does mean the dynamicmethod isn't useless for most simple methods, and maybe this is a framework issue, not a dynamicmethod one, i'll write some long il method directly and see what happens.
Ben Voigt [C++ MVP] - 18 Mar 2008 18:47 GMT > discovered one more thing; > > with the count reduced to 25, im now getting a speed UP, about 15% on > ms.net and 30% on mono, which was what i was originally expecting, so > roughly speaking the problem starts above about 1k op-codes, and is > over linear on the op-code count. Well, optimization is a tricky business.
It doesn't make sense to unroll loops by that great a factor. The savings in terms of avoiding the loop test diminish, while your code size continues to grow. You're seeing the effects of exceeding the L1 instruction cache. When that happens, the performance hit you take costs far more than the savings from unrolling.
This is why the current wisdom is that "optimize for space" flags perform better than "optimize for speed". Actually some unrolling and inlining is helpful, but aim for a unrolling factor of 20 or less (find some number that evenly divides your total iteration count).
> which doesn't help me but does mean the dynamicmethod isn't useless > for most simple methods, and maybe this is a framework issue, not a > dynamicmethod one, i'll write some long il method directly and see > what happens. psi - 20 Mar 2008 17:56 GMT > > discovered one more thing; > [quoted text clipped - 7 lines] > It doesn't make sense to unroll loops by that great a factor. The savings > in terms of avoiding the loop test diminish, while your code size continues in this case its not just the loop test, the array addresses will depend on loop position, so it saves much more that way.
> to grow. You're seeing the effects of exceeding the L1 instruction cache. > When that happens, the performance hit you take costs far more than the [quoted text clipped - 4 lines] > helpful, but aim for a unrolling factor of 20 or less (find some number that > evenly divides your total iteration count). thanks, that makes sense, of course the optimum unrolling factor will depend on the actual loop length, and also the size of L1, so then you would want to code for the smallest or take account of the actual size somehow, which is hitting 'diminishing returns' i think.
strange thought that dynamicmethod seems to hit this much earlier.
psi - 11 Mar 2008 14:41 GMT really should say what this does,
makes a dynamicmethod that takes an array, and adds together the first six itesm and stores into first the result, repeats this 1000 times in ONE big block of il, runs the method once to ignore setup, then times running it 10000 times, then times running basically the same il 10000000 times. this has been done for various cycle counts to get the same rates.
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|