Converting MSIL to C# Source Code

gangelo - 26 Apr 2007 15:00 GMT
I am trying to get some direction/documentation on how to convert MSIL to C#
source code - like Reflector. Is there any documentation out there on this?
Barry Kelly - 26 Apr 2007 16:47 GMT
> I am trying to get some direction/documentation on how to convert MSIL to C#
> source code - like Reflector. Is there any documentation out there on this?

Google 'decompiling assembler to source code', and you'll get leads.

If you're familiar with basic compilation techniques, such as generating
particular code patterns for expression trees, you'll find that it's a
similar process but in reverse: look for patterns and construct trees.

For example, RPN code for simple expressions is almost trivially
translatable into an expression tree, and stack machines such as the CLR
and the JVM use RPN code for expressions. To make it concrete, suppose
you have the following code:

 ldc.i4 10
 ldc.i4 2
 ldc.i4 16
 mul
 add

Now interpret this symbolically, adding a leaf for each push instruction
and combining the top two stack entries into a node for each binary
operator (the stack is shown top first):

 push (constant 10)
 -- stack: 10
 push (constant 2)
 -- stack: 2 10
 push (constant 16)
 -- stack: 16 2 10
 mul
 -- stack: (mul 2 16) 10
 add
 -- stack: (add 10 (mul 2 16))

And there you have the tree, in Lisp notation: (add 10 (mul 2 16)). The
operand that was pushed first becomes the left operand, which matters
for non-commutative operators such as sub and div. Do an in-order
traversal of that with simple formatting and you can get infix notation.
Being extra safe and adding parentheses everywhere, it turns into:

 (10) + ((2) * (16))
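
Here is a minimal Python sketch of that symbolic interpretation. The tuple encoding of instructions and the names build_tree and to_infix are made up for illustration; a real decompiler would read IL from an assembly and handle many more opcodes. Operands are kept in the order they were pushed, so the value pushed first is the left operand.

```python
# Hypothetical mini-decompiler: the instruction encoding and the function
# names are illustrative assumptions, not taken from any real tool.
BINARY_OPS = {"add": "+", "sub": "-", "mul": "*", "div": "/"}

def build_tree(instructions):
    """Symbolically execute stack code and return the expression tree."""
    stack = []
    for op, *args in instructions:
        if op == "ldc.i4":
            stack.append(("const", args[0]))   # leaf for a pushed constant
        elif op in BINARY_OPS:
            right = stack.pop()                # pushed last
            left = stack.pop()                 # pushed first
            stack.append((op, left, right))    # interior node
        else:
            raise ValueError(f"unsupported opcode: {op}")
    if len(stack) != 1:
        raise ValueError("expression did not reduce to a single value")
    return stack[0]

def to_infix(node):
    """In-order traversal with parentheses around every operand."""
    if node[0] == "const":
        return str(node[1])
    op, left, right = node
    return f"({to_infix(left)}) {BINARY_OPS[op]} ({to_infix(right)})"

code = [("ldc.i4", 10), ("ldc.i4", 2), ("ldc.i4", 16), ("mul",), ("add",)]
tree = build_tree(code)
print(to_infix(tree))
```

The same loop works for any straight-line sequence of pushes and binary operators; anything that branches needs the statement-level analysis instead.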

You can remove the redundant parentheses by taking precedence into
account during the traversal. Things are a little more complex for
boolean expressions using && and || because of short-circuit evaluation:
they compile to conditional branches, so they end up looking like nested
if statements and the like.
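
As a rough sketch of the precedence-aware traversal (arithmetic only, no && or ||): the PRECEDENCE table and the render function below are hypothetical, and the tree is written as nested (op, left, right) tuples with plain integers as leaves. A node gets parentheses only when its parent binds tighter, or binds equally and the node is the right operand of a left-associative operator.

```python
# Binding strength per operator; higher binds tighter. Illustrative only.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def render(node, parent_prec=0, is_right=False):
    """Render a tree of (op, left, right) tuples or int leaves as infix."""
    if not isinstance(node, tuple):
        return str(node)
    op, left, right = node
    prec = PRECEDENCE[op]
    text = f"{render(left, prec)} {op} {render(right, prec, True)}"
    # Parenthesize when the parent binds tighter, or binds equally and this
    # node is a right operand (these operators are left-associative).
    if prec < parent_prec or (prec == parent_prec and is_right):
        return f"({text})"
    return text

print(render(("+", 10, ("*", 2, 16))))   # 10 + 2 * 16
print(render(("*", ("+", 10, 2), 16)))   # (10 + 2) * 16
print(render(("-", 10, ("-", 2, 16))))   # 10 - (2 - 16)
```

This errs on the safe side: it keeps the parentheses in 10 + (2 + 16) even though they could be dropped for integer addition.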

Outside of expressions, at the statement level, you'll want to break up
the flow into basic blocks - i.e. contiguous stretches of instructions
that have no jump targets except at the start, and no jumps out except
at the end. That's a basic compilation technique too; you'll find more
info in compiler texts. You can turn the results of these two
operations (symbolically interpreting expressions, and basic block
analysis) directly into rather ugly code filled with jumps,
if-statements and expressions, but more work can get you further.
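
A small Python sketch of the basic block split, using the usual "leaders" approach from compiler texts. The encoding is an assumption made for the example: each instruction is an (opcode, operand) pair, and branch operands are indices into the instruction list rather than byte offsets. Only three branch opcodes are recognised here.

```python
BRANCHES = {"br", "brtrue", "brfalse"}

def split_basic_blocks(instructions):
    """Split (opcode, operand) pairs into basic blocks.

    Branch operands are indices into `instructions` (an assumed encoding).
    Returns a list of (start, end) index ranges, end exclusive.
    """
    leaders = {0} if instructions else set()
    for index, (opcode, operand) in enumerate(instructions):
        if opcode in BRANCHES:
            leaders.add(operand)               # a jump target starts a block
            if index + 1 < len(instructions):
                leaders.add(index + 1)         # so does the fall-through
    starts = sorted(leaders)
    ends = starts[1:] + [len(instructions)]
    return list(zip(starts, ends))

# A small loop: condition, exit test, body, jump back, then the exit.
code = [
    ("ldloc.0", None),   # 0: evaluate condition
    ("brfalse", 4),      # 1: leave the loop
    ("call", "Body"),    # 2: loop body
    ("br", 0),           # 3: back to the condition
    ("ret", None),       # 4: after the loop
]
print(split_basic_blocks(code))  # [(0, 2), (2, 4), (4, 5)]
```

Real IL also has switch, leave and exception handler boundaries, which would add more leaders; the sketch ignores those.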

There are a number of typical patterns that high-level structures such
as loops turn into. Consider 'while': it might end up looking like this:

 label_A:
 <evaluate condition>
 brfalse label_B
 // loop body
 br label_A
 label_B:

Or like this:

 br label_B
 label_A:
 // loop body
 label_B:
 <evaluate condition>
 brtrue label_A

But the basic pattern is clear: there's a backward edge in the flow
graph (basic blocks are nodes, and edges are the branches and
fall-throughs between them), and either there's a conditional jump out
of the loop, or the backward jump itself is conditional. These things are
made slightly more complex by 'break' and 'continue', but hopefully you
can see the general idea.
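
One common way to find such loop edges is a depth-first walk of the flow graph: an edge that leads back to a block still on the current path closes a loop. The sketch below is one possible approach, not necessarily what any particular decompiler does, and the dictionary encoding of the graph and the block names are made up. It should be adequate for the structured control flow a C# compiler normally emits; irreducible graphs need more care.

```python
def find_back_edges(graph, entry):
    """Return edges (src, dst) where dst is still on the DFS path to src."""
    back_edges = []
    on_path = set()
    visited = set()

    def visit(node):
        visited.add(node)
        on_path.add(node)
        for succ in graph.get(node, []):
            if succ in on_path:
                back_edges.append((node, succ))   # succ is an ancestor: loop
            elif succ not in visited:
                visit(succ)
        on_path.discard(node)

    visit(entry)
    return back_edges

# Flow graph of a 'test at the top' loop: cond -> body or exit, body -> cond.
graph = {"cond": ["body", "exit"], "body": ["cond"], "exit": []}
print(find_back_edges(graph, "cond"))  # [('body', 'cond')]
```

The target of each back edge is the loop header; the blocks that can reach the source of the edge without passing through the header make up the loop body.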

If you tune a decompiler for a given compiler's output, such as the MS
C# compiler, you can be a bit more cheeky and try to match against
specific code patterns. I suspect that technique would end up being
more work in the end than a more general approach, though.
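
To give a feel for the "cheeky" approach, here is a hedged Python sketch that matches just one loop shape (condition first, brfalse to the exit, br back to the top) directly on a linear instruction list. The function name match_while and the (opcode, operand) encoding with index-based branch targets are assumptions for the example. It fails on any other layout, which is exactly why this style tends to need one matcher per pattern.

```python
def match_while(instructions, start=0):
    """Try to match a 'test at the top' loop beginning at index `start`.

    Returns (condition, body) as instruction lists, or None if the shape
    does not match. Branch operands are instruction indices (assumed).
    """
    for i in range(start, len(instructions)):
        opcode, target = instructions[i]
        if opcode not in ("br", "brtrue", "brfalse"):
            continue
        # The first branch must be a forward 'brfalse' to the loop exit.
        if opcode != "brfalse" or target <= i or target > len(instructions):
            return None
        # The instruction just before the exit must jump back to the start.
        if instructions[target - 1] != ("br", start):
            return None
        return instructions[start:i], instructions[i + 1:target - 1]
    return None

code = [
    ("ldloc.0", None),   # 0: evaluate condition
    ("brfalse", 4),      # 1: leave the loop
    ("call", "Body"),    # 2: loop body
    ("br", 0),           # 3: back to the condition
    ("ret", None),       # 4: after the loop
]
print(match_while(code))
```

A loop compiled with the test at the bottom returns None here and would need its own matcher.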

-- Barry

http://barrkel.blogspot.com/

Chris Mullins [MVP] - 26 Apr 2007 19:18 GMT
Just point reflector at reflector, and see how it does it! :)

Chris Mullins, MCSD.NET, MCPD:Enterprise, Microsoft C# MVP
http://www.coversant.com/blogs/cmullins

Barry Kelly - 26 Apr 2007 20:25 GMT
> Just point reflector at reflector, and see how it does it! :)

Last time I did that, I found that Reflector is built as an assembly
stored as a resource inside the main Reflector EXE, and encrypted with a
trivial XOR key; and then the inner assembly itself was obfuscated.

Later versions use different techniques IIRC, with Win32 resources /
sections rather than .NET resources.

-- Barry

http://barrkel.blogspot.com/

Michael Nemtsev - 27 Apr 2007 15:06 GMT
Hello gangelo,

Just to add to what the other posters said: read about "pattern matching"
(http://en.wikipedia.org/wiki/Pattern_matching), because Reflector uses
this principle.

---
WBR,  Michael  Nemtsev [.NET/C# MVP].  
My blog: http://spaces.live.com/laflour
Team blog: http://devkids.blogspot.com/

"The greatest danger for most of us is not that our aim is too high and we
miss it, but that it is too low and we reach it" (c) Michelangelo

