> I am trying to get some direction/documentation on how to convert MSIL to C#
> source code - like Reflector. Is there any documentation out there on this?
Google 'decompiling assembler to source code', and you'll get leads.
If you're familiar with basic compilation techniques, such as generating
particular code patterns for expression trees, you'll find that it's a
similar process but in reverse: look for patterns and construct trees.
For example, RPN code for simple expressions is almost trivially
translatable into an expression tree, and stack machines such as the CLR
and the JVM use RPN code for expressions. To make it concrete, suppose
you have the following code:
ldc.i4 10
ldc.i4 2
ldc.i4 16
mul
add
Now, interpret this symbolically, by adding leaves for push instructions
and combining leaves with nodes for binary operators:
push (constant 10)
-- stack: 10
push (constant 2)
-- stack: 2 10
push (constant 16)
-- stack: 16 2 10
mul
-- stack: (mul 2 16) 10
add
-- stack: (add 10 (mul 2 16))
And there you have the tree, in Lisp notation: (add 10 (mul 2 16)). Note
that the value pushed first becomes the left operand; that makes no
difference for add and mul, but it does for sub and div. Do an in-order
traversal of the tree with simple formatting and you can get infix
notation. Being extra safe and adding parentheses everywhere, it turns
into:
(10) + ((2) * (16))
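To show the symbolic interpretation as code, here is a minimal sketch in Python. The tuple-based instruction format and the names build_tree and to_infix are made up for illustration; only the opcode names come from IL. It handles nothing beyond integer constants and four binary operators, and it puts the operand that was pushed first on the left, which is the order the CLR uses for non-commutative operators such as sub and div.

```python
# Hypothetical mini instruction format: ("ldc.i4", n) or a one-element
# tuple naming a binary operator, e.g. ("mul",).
BINARY_OPS = {"add": "+", "sub": "-", "mul": "*", "div": "/"}


def build_tree(instructions):
    """Symbolically execute stack code and return the resulting tree."""
    stack = []
    for instr in instructions:
        opcode = instr[0]
        if opcode == "ldc.i4":
            # A push becomes a leaf.
            stack.append(("const", instr[1]))
        elif opcode in BINARY_OPS:
            # The top of the stack is the second (right-hand) operand.
            right = stack.pop()
            left = stack.pop()
            stack.append((opcode, left, right))
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    if len(stack) != 1:
        raise ValueError("code did not reduce to a single expression")
    return stack[0]


def to_infix(node):
    """In-order traversal that parenthesises everything to be safe."""
    if node[0] == "const":
        return f"({node[1]})"
    op, left, right = node
    return f"({to_infix(left)} {BINARY_OPS[op]} {to_infix(right)})"


code = [("ldc.i4", 10), ("ldc.i4", 2), ("ldc.i4", 16), ("mul",), ("add",)]
tree = build_tree(code)
print(to_infix(tree))  # ((10) + ((2) * (16)))
```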
You can remove the redundant parentheses by taking precedence into
account during the traversal. Things are a little more complex for
boolean expressions using && and || because of short-circuit evaluation:
they typically compile to conditional branches, so they end up looking
like nested if statements and the like.
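Removing the redundant parentheses could look roughly like the following Python sketch. The node layout (tuples such as ("add", left, right) and ("const", n)), the precedence table and the name format_expr are assumptions for the example, not anything Reflector is known to do. It treats all four operators as left-associative, so it keeps parentheses around a right-hand operand of equal precedence.

```python
PRECEDENCE = {"add": 1, "sub": 1, "mul": 2, "div": 2}
SYMBOL = {"add": "+", "sub": "-", "mul": "*", "div": "/"}


def format_expr(node, parent_prec=0, is_right=False):
    """Format a tree as infix, parenthesising only where needed."""
    if node[0] == "const":
        return str(node[1])
    op, left, right = node
    prec = PRECEDENCE[op]
    text = (
        f"{format_expr(left, prec, False)} "
        f"{SYMBOL[op]} "
        f"{format_expr(right, prec, True)}"
    )
    # Parentheses are needed when the parent binds tighter, or binds
    # equally tightly and this node is its right-hand operand.
    if prec < parent_prec or (prec == parent_prec and is_right):
        return f"({text})"
    return text


tree = ("add", ("const", 10), ("mul", ("const", 2), ("const", 16)))
print(format_expr(tree))  # 10 + 2 * 16

nested = ("sub", ("const", 1), ("sub", ("const", 2), ("const", 3)))
print(format_expr(nested))  # 1 - (2 - 3)
```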
Outside of expressions, at the statement level, you'll want to break up
the flow into basic blocks - i.e. contiguous stretches of instructions
that don't have any jump targets inside them, and don't have any jumps
out except at the end. That's a basic compilation technique too; you'll
find more info in compiler texts. You can turn the results of these two
operations (symbolically interpreting expressions, and basic block
analysis) directly into rather ugly code filled with jumps,
if-statements and expressions, but more work can get you further.
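The basic block split is usually described in terms of "leaders": the first instruction, every jump target, and every instruction that follows a jump. A rough Python sketch, assuming a made-up format where each instruction is an (opcode, operand) pair and branch operands are already resolved to instruction indices (real IL uses byte offsets and has many more branch opcodes than the three listed here):

```python
# Only a small subset of branch opcodes, for illustration.
BRANCHES = {"br", "brtrue", "brfalse"}


def split_basic_blocks(instructions):
    """Return basic blocks as (start, end) index pairs, end exclusive."""
    leaders = {0} if instructions else set()
    for index, (opcode, operand) in enumerate(instructions):
        has_next = index + 1 < len(instructions)
        if opcode in BRANCHES:
            leaders.add(operand)        # a jump target starts a block
            if has_next:
                leaders.add(index + 1)  # so does whatever follows a jump
        elif opcode == "ret" and has_next:
            leaders.add(index + 1)      # ret also ends a block
    starts = sorted(leaders)
    ends = starts[1:] + [len(instructions)]
    return list(zip(starts, ends))


# A small loop: test at index 0-1, body at 2-3, exit at 4.
program = [
    ("ldloc.0", None),
    ("brfalse", 4),
    ("nop", None),
    ("br", 0),
    ("ret", None),
]
print(split_basic_blocks(program))  # [(0, 2), (2, 4), (4, 5)]
```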
There are a number of typical patterns that high-level structures such
as loops turn into. Consider 'while': it might end up looking like this:
label_A:
<evaluate condition>
brfalse label_B
// loop body
br label_A
label_B:
Or like this:
br label_B
label_A:
// loop body
label_B:
<evaluate condition>
brtrue label_A
But the basic pattern is clear: there's a backward edge in the flow
graph (basic blocks are nodes, and edges are the branches and
fall-throughs between them within a method), and there's either a
conditional jump out of the loop in the flow graph, or the backward jump
is conditional. These things are made slightly more complex by 'break'
and 'continue', but hopefully you can see the general idea.
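One simple way to look for such edges is a depth-first walk of the flow graph, flagging any edge whose target is still on the current path. The Python sketch below assumes the graph is given as a dict from block name to successor list; the block names and the function find_back_edges are invented for the example. This DFS notion of a back edge is a simplification that works for the well-structured graphs compilers normally emit; a more careful implementation would use dominators.

```python
def find_back_edges(successors, entry):
    """Return edges (src, dst) whose dst is still on the DFS path."""
    back_edges = []
    state = {}  # block -> "active" while on the path, "done" afterwards

    def visit(block):
        state[block] = "active"
        for succ in successors.get(block, []):
            if state.get(succ) == "active":
                back_edges.append((block, succ))
            elif succ not in state:
                visit(succ)
        state[block] = "done"

    visit(entry)
    return back_edges


# First 'while' shape: condition at the top (block A).
while_top = {"A": ["body", "B"], "body": ["A"], "B": []}
# Second shape: jump to the condition (block B) at the bottom.
while_bottom = {"entry": ["B"], "A": ["B"], "B": ["A", "exit"], "exit": []}

print(find_back_edges(while_top, "A"))         # [('body', 'A')]
print(find_back_edges(while_bottom, "entry"))  # [('A', 'B')]
```

In both shapes the edge found points at the block that evaluates the condition, so the two layouts can be treated as the same loop. Note that this is a graph-level notion of "backward" and need not coincide with a jump to a lower address.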
If you tune a decompiler for a given compiler's output, such as the MS
C# compiler, you can be a bit more cheeky and try to match against
specific code patterns. I suspect that technique would end up being
more work in the end than a more general approach, though.
-- Barry

http://barrkel.blogspot.com/
Just point reflector at reflector, and see how it does it! :)

Chris Mullins, MCSD.NET, MCPD:Enterprise, Microsoft C# MVP
http://www.coversant.com/blogs/cmullins
Barry Kelly - 26 Apr 2007 20:25 GMT
> Just point reflector at reflector, and see how it does it! :)
Last time I did that, I found that Reflector is built as an assembly
stored as a resource inside the main Reflector EXE, and encrypted with a
trivial XOR key; and then the inner assembly itself was obfuscated.
Later versions use different techniques IIRC, with Win32 resources /
sections rather than .NET resources.
-- Barry

Hello gangelo,
Just to add to what the other posters said: read about "pattern matching"
(http://en.wikipedia.org/wiki/Pattern_matching), because Reflector uses
this principle.
---
WBR, Michael Nemtsev [.NET/C# MVP].
My blog: http://spaces.live.com/laflour
Team blog: http://devkids.blogspot.com/
"The greatest danger for most of us is not that our aim is too high and we
miss it, but that it is too low and we reach it" (c) Michelangelo