> hi,
>
> I've seen two ways of copying streams in .NET:
>
> Stream outStream = new MemoryStream();
> Stream inStream = CreateSomeMemoryStream();
I assume you're not always reading and writing to memory-streams.
> byte[] buffer = new byte[inStream.Length];
> inStream.Read(buffer, 0, buffer.Length);
> outStream.Write(buffer, 0, buffer.Length);
That's just plain horrible: it only works on streams with a known length,
and it isn't guaranteed to read the entire in-stream (in the general case
a stream may return less data than asked for, even if more is available).
> and the second one was more or less like this:
>
> Stream outStream = new MemoryStream();
> Stream inStream = CreateSomeMemoryStream();
I still don't understand these memory-streams..., do you really *need* a
memory-copy of the content of inStream?
> int size = 2048;
> byte[] buffer = new byte[size];
[quoted text clipped - 6 lines]
> break;
> }
This is quite a lot better. I prefer a do...while loop though.
> i wonder if there are any guidelines on this issue? should i read in a
> loop or can i read at once and the CLR will take care of everything?
You *should* read in a loop, since Read may not return the entire
content of the stream.
Here is how I usually do it:
using System.IO;

class StreamUtil
{
    // Copies all of input to output, reusing the same window of buffer
    // (count bytes starting at offset) as scratch space for every chunk.
    public static void CopyStream(
        Stream input, Stream output,
        byte[] buffer, int offset, int count)
    {
        int read;
        do
        {
            // Read may return fewer than count bytes; 0 means end of stream.
            read = input.Read(buffer, offset, count);
            if ( read > 0 )
                output.Write(buffer, offset, read);
        } while ( read > 0 );
    }

    public static void CopyStream(Stream input, Stream output, byte[] buffer)
    { CopyStream(input, output, buffer, 0, buffer.Length); }

    public static void CopyStream(Stream input, Stream output, int bufferSize)
    { CopyStream(input, output, new byte[bufferSize]); }

    public static void CopyStream(Stream input, Stream output)
    { CopyStream(input, output, 8*1024); }

    // ... more utils here
}

class Foo
{
    // input and output come from somewhere
    void DoingStuff(Stream input, Stream output) {
        // ...
        StreamUtil.CopyStream(input, output);
        // ...
    }
}
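
A note for readers on later framework versions: a loop of this kind is available in the base class library as Stream.CopyTo, which was added in .NET Framework 4.0 and so did not exist when this was posted. A minimal sketch, where CopyDemo and its method names are hypothetical and the 8 KB buffer simply mirrors the default above:

```csharp
using System.IO;

static class CopyDemo
{
    // Copies everything remaining in input to output using the built-in loop.
    // The buffer size is an illustrative choice, not a recommendation.
    public static void Copy(Stream input, Stream output)
    {
        input.CopyTo(output, 8 * 1024);
    }

    // Round-trips a byte array through two MemoryStreams to show the call in use.
    public static byte[] RoundTrip(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var output = new MemoryStream())
        {
            Copy(input, output);
            return output.ToArray();
        }
    }
}
```

On older frameworks the hand-written loop remains the way to go.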

Signature
Helge Jensen
mailto:helge.jensen@slog.dk
sip:helge.jensen@slog.dk
-=> Sebastian cover-music: http://ungdomshus.nu <=-
SharpCoderMP - 25 Apr 2006 22:15 GMT
> I assume you're not always reading and writing to memory-streams.
yes, you're right.
> I still don't understand these memory-streams..., do you really *need* a
> memory-copy of the content of inStream?
this was rather for the purpose of this example. usually i work with
streams produced by some classes or created from files and i cannot
predict their origin. i usually end up with something like this:
Stream myStream = SomeThirdPartyClass.GetStream();
and then i need to do some processing before i can use this stream. the
processing makes major changes to the data inside the stream, also
changing its length. if i then want to write myStream to a file, i need
to create a new FileStream and copy the contents of myStream into that
newly created FileStream.
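
A sketch of that last step, assuming the processed stream is readable from its current position. SaveDemo, SaveToFile, and the path argument are hypothetical names used only for illustration:

```csharp
using System.IO;

static class SaveDemo
{
    // Writes whatever remains in source to a new file at path and returns
    // the number of bytes written. It reads in a loop because Read may
    // return fewer bytes than requested.
    public static long SaveToFile(Stream source, string path)
    {
        byte[] buffer = new byte[8 * 1024];
        long total = 0;
        using (FileStream file = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                file.Write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

Because the loop never asks for the stream's Length, it does not matter that processing changed the amount of data.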
> This is quite a lot better. I prefer a do...while loop though.
yeah i know.
>> i wonder if there are any guidelines on this issue? should i read in a
>> loop or can i read at once and the CLR will take care of everything?
>
> You *should* read in a loop, since Read may not return the entire
> content of the stream.
ok. thanks for the info.
> Here is how I usually do it:
>
[quoted text clipped - 31 lines]
> }
> }
Ben Voigt - 03 May 2006 16:21 GMT
>> hi,
>>
[quoted text clipped - 55 lines]
> if ( read > 0 )
> output.Write(buffer, offset, read);
You need to adjust offset and count by read...
offset += read;
count -= read;
> } while ( read > 0 );
while (count > 0)
> }
> public static void CopyStream(Stream input, Stream output, byte[] buffer)
[quoted text clipped - 16 lines]
> }
> }
Helge Jensen - 04 May 2006 10:58 GMT
> "Helge Jensen" <helge.jensen@slog.dk> wrote in message
>> using System.IO;
>> class StreamUtil
[quoted text clipped - 15 lines]
>
>> } while ( read > 0 );
The code should work. The intent of it is to read the *entire* input,
writing it to output:
The loop terminates when read <= 0, that is, when the input is exhausted.
It reads up to count bytes at a time into buffer, at the specified offset,
and writes those same bytes to output.
If I were however trying to fill a buffer from a stream.. which it seems
like you are suggesting, I would do something like:
// Reads up to count bytes; returns fewer only if the stream ends first.
public static int ReadFromStream(
    Stream input, byte[] buffer, int offset, int count) {
    int r;
    for ( int i = 0; i < count; i += r ) {
        r = input.Read(buffer, offset+i, count-i);
        if ( r == 0 )
            return i;
    }
    return count;
}

// Reads exactly count bytes or throws.
public static void FillFromStream(
    Stream input, byte[] buffer, int offset, int count) {
    int read = ReadFromStream(input, buffer, offset, count);
    if ( read != count )
        throw new EndOfStreamException(
            string.Format(
                "Required {0} bytes, got {1} before EndOfStream",
                count, read));
}
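
To illustrate where a fill helper like the one above earns its keep, here is a sketch that reads a length-prefixed record. The wire format (a 4-byte little-endian length followed by the payload) and the names RecordReader, Fill, and ReadRecord are assumptions made up for this example:

```csharp
using System;
using System.IO;

static class RecordReader
{
    // Loops until exactly count bytes have arrived, or throws if the
    // stream ends first.
    static void Fill(Stream input, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int r = input.Read(buffer, offset + total, count - total);
            if (r == 0)
                throw new EndOfStreamException(
                    string.Format("Required {0} bytes, got {1} before end of stream",
                                  count, total));
            total += r;
        }
    }

    // Hypothetical format: 4-byte little-endian length, then the payload.
    public static byte[] ReadRecord(Stream input)
    {
        byte[] header = new byte[4];
        Fill(input, header, 0, header.Length);
        int length = header[0] | (header[1] << 8) | (header[2] << 16) | (header[3] << 24);
        if (length < 0)
            throw new InvalidDataException("Negative record length");
        byte[] payload = new byte[length];
        Fill(input, payload, 0, length);
        return payload;
    }
}
```

Here the offset does advance within the buffer, because the goal is to fill it rather than to pass data through it. On .NET 7 and later, Stream.ReadExactly covers the same need.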
>i wonder if there are any guidelines on this issue? should i read in a
>loop or can i read at once and the CLR will take care of everything?
IMO you should use a loop. While it may not be needed when working
exclusively with MemoryStreams, it could be necessary for other types
of streams.
That doesn't mean you have to read 2K at a time though. If the input
stream has a known length you can directly create a byte[] of the
correct size and keep reading into that. You can then construct the
output stream around that byte array with the MemoryStream(byte[])
constructor to avoid making a third copy of the data.
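
A sketch of that approach, assuming the input is seekable so that Length and Position are meaningful. KnownLengthCopy and ToMemoryStream are hypothetical names:

```csharp
using System;
using System.IO;

static class KnownLengthCopy
{
    // Reads the rest of a seekable input into one exactly sized array,
    // then wraps that array in a MemoryStream without copying it again.
    public static MemoryStream ToMemoryStream(Stream input)
    {
        long remaining = input.Length - input.Position;
        if (remaining > int.MaxValue)
            throw new IOException("Input too large for a single byte array");

        byte[] data = new byte[(int)remaining];
        int filled = 0;
        while (filled < data.Length)
        {
            // Still a loop: each Read may deliver only part of what is left.
            int r = input.Read(data, filled, data.Length - filled);
            if (r == 0)
                throw new EndOfStreamException("Stream ended before its reported length");
            filled += r;
        }
        return new MemoryStream(data);
    }
}
```

The resulting MemoryStream has a fixed capacity, since it is backed by the supplied array.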
Mattias

Signature
Mattias Sjögren [C# MVP] mattias @ mvps.org
http://www.msjogren.net/dotnet/ | http://www.dotnetinterop.com
Please reply only to the newsgroup.
One thing that has yet to be mentioned as a reason why it's better to use the
looping methodology is large objects ...
if you use the simple read/write methodology you end up with a byte array
the size of the data being created. If this byte size is over a certain
limit (I forget what the exact number is off the top of my head but want to
say 85k ??? someone feel free to correct me) it is treated as a large object
and put on a special heap. Large objects have differing life cycles and are
not garbage collected in the same way as normal objects .. this can lead to
a possible memory usage attack by someone forcing this operation to happen
repeatedly ... I have never tried to take an app down in this way but I
would imagine that it is feasible.
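
For reference, the documented threshold on Microsoft's CLR is 85,000 bytes, and large-object-heap allocations are treated as part of generation 2. A small sketch for observing this; LohDemo and GenerationOfNewArray are hypothetical names, and other runtimes may behave differently:

```csharp
using System;

static class LohDemo
{
    // Returns the generation the GC reports for a freshly allocated
    // byte array of the given size.
    public static int GenerationOfNewArray(int size)
    {
        byte[] data = new byte[size];
        int generation = GC.GetGeneration(data);
        GC.KeepAlive(data); // keep the array alive until after the query
        return generation;
    }
}
```

Comparing GenerationOfNewArray(200000) with GC.MaxGeneration shows whether the array was placed with the oldest generation straight away; a small array normally starts in generation 0.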
Cheers,
Greg Young
> hi,
>
[quoted text clipped - 23 lines]
> i wonder if there are any guidelines on this issue? should i read in a
> loop or can i read at once and the CLR will take care of everything?
Helge Jensen - 26 Apr 2006 08:46 GMT
> One thing that has yet to be mentioned as a reason why its better to use the
> looping methodology is large objects ...
The looping strategy is required for semantic reasons. A stream may
return fewer bytes than requested, even if the stream is not closed yet.
Invoking s.Read(buf, 0, buf.Length) may return as soon as any data is
available, or when the stream is closed.
In the real world, not looping will often break with, for example,
network streams, where data arrives "slowly".
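
The effect can be reproduced without a network. The sketch below uses a hypothetical wrapper, TrickleStream, that hands out at most maxChunk bytes per Read call; it is an illustration only, not a class from the framework:

```csharp
using System;
using System.IO;

// Hypothetical wrapper that imitates a stream whose data arrives slowly.
class TrickleStream : Stream
{
    private readonly Stream inner;
    private readonly int maxChunk;

    public TrickleStream(Stream inner, int maxChunk)
    {
        this.inner = inner;
        this.maxChunk = maxChunk;
    }

    // Never returns more than maxChunk bytes, however many are requested.
    public override int Read(byte[] buffer, int offset, int count)
    {
        return inner.Read(buffer, offset, Math.Min(count, maxChunk));
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

static class TrickleDemo
{
    // Bytes obtained by one Read call that asks for everything at once.
    public static int SingleRead(Stream s, int wanted)
    {
        return s.Read(new byte[wanted], 0, wanted);
    }

    // Bytes obtained by looping until the buffer is full or Read returns 0.
    public static int LoopedRead(Stream s, int wanted)
    {
        byte[] buf = new byte[wanted];
        int total = 0, r;
        while (total < wanted && (r = s.Read(buf, total, wanted - total)) > 0)
            total += r;
        return total;
    }
}
```

With 1000 bytes behind a TrickleStream limited to 100 bytes per call, SingleRead gets only the first 100 bytes while LoopedRead gets all 1000.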
> if you use the simple read/write methodology you end up with a byte array
> the size of the data being created. If this byte size is over a certain
[quoted text clipped - 3 lines]
> not garbage collected in the same way as normal objects .. this can lead to
> a possible memory usage attack by someone forcing this operation to happen
Creating a large number of large objects is also not good, but avoiding
that is not semantically required.
> repeatedly ... I have never tried to take an app down in this way but I
> would imagine that it is feasible.
That can most certainly be done :) I have seen code (in effect) like:
while ( true ) {
    using ( MemoryStream s = new MemoryStream() )
    using ( Stream connection = awaitConnectionAndGetStream() ) {
        // work around Deserialize invoking s.Close():
        // buffer the whole message in memory first
        Util.StreamCopy(connection, s);
        try {
            object o = formatter.Deserialize(s);
        }
        // process and send reply
    }
}
When conversing with this server for any significant amount of time,
transmitting a few objects of 85k, the .NET runtime would break down with
an out-of-memory error.
If you expect (or risk receiving) large amounts of input, you should
process the data from that stream in a streaming manner, that is -- a
"little" at a time.
In the end I rewrote the above code to pass a proxy for the
connection to Deserialize. That proxy works around the Close and some
other issues, and it allows the caller to limit the amount of data that
can be read from the proxy.
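
The proxy itself is not shown in the thread. As a rough sketch of the idea only, under the assumption that a read-only wrapper is enough, something like the following would cap the readable bytes and leave the underlying stream open. LimitedReadStream is a hypothetical name:

```csharp
using System;
using System.IO;

// Hypothetical read-only proxy: caps the total number of bytes that can be
// read, and does not close the wrapped stream when the proxy is closed.
class LimitedReadStream : Stream
{
    private readonly Stream inner;
    private long remaining;

    public LimitedReadStream(Stream inner, long maxBytes)
    {
        this.inner = inner;
        this.remaining = maxBytes;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int allowed = (int)Math.Min(count, remaining);
        // Simplification: once the limit is used up, any further read
        // throws, so choose maxBytes above the largest legitimate message.
        if (allowed == 0 && count > 0)
            throw new IOException("Read limit exceeded");
        int r = inner.Read(buffer, offset, allowed);
        remaining -= r;
        return r;
    }

    // Dispose is deliberately not overridden, so Close() leaves inner open.
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}
```

Passing such a wrapper to the deserializer means the message never has to be buffered whole in a MemoryStream first.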

Greg Young [MVP] - 26 Apr 2006 08:51 GMT
I understand it is required. I was listing the large objects as an
additional benefit :)
"> One thing that has yet to be mentioned as a reason why its better to use
the
> looping methodology is large objects ..."
That is to say "in addition to" the other reasons mentioned.
Cheers,
Greg
>> One thing that has yet to be mentioned as a reason why its better to use
>> the
[quoted text clipped - 54 lines]
> other issues and it allows the caller to limit the amount of data that
> can be read from the proxy.
Helge Jensen - 26 Apr 2006 09:10 GMT
> I understand it is required. I was listing the large objects as an
> additional benefit :)
Okay, I see now.
<splitting hairs>
> "> One thing that has yet to be mentioned as a reason why its better to use
> the
>
>>looping methodology is large objects ..."
>
> That is to say "in addition to" the other reasons mentioned.
What happened was; I don't really see that there is any choice in the
matter, looping is required for correctness, not for optimization. That
made me choke a bit on the "conditional" way your reply was formulated:
"why *it's better* to use the looping..." [My emphasis]
It would have gone down a treat as something along the lines of:
"In addition to the semantics/correctness issues of not-looping,
there is the issue of large objects ..."
</splitting hairs>

Greg Young [MVP] - 27 Apr 2006 03:47 GMT
<splittinghairs>
In the example given he was using memorystreams so it is not semantically
required.
It only becomes semantically required with a stream object that can return
only a portion of the available data :)
</splittinghairs>
>> I understand it is required. I was listing the large objects as an
>> additional benefit :)
[quoted text clipped - 22 lines]
>
> </splitting hairs>
Helge Jensen - 27 Apr 2006 08:10 GMT
> <splittinghairs>
> In the example given he was using memorystreams so it is not semantically
[quoted text clipped - 3 lines]
> only a portion of the available data :)
> </splittinghairs>
That's true :)
