.NET Forum / Languages / C# / October 2007
How to split a compressed file programmatically?
Brian Roisentul - 26 Oct 2007 14:46 GMT
Hi everyone,
I've been looking into this for a while, but with no luck yet.
Does anybody know a way to do it?
Many thanks,
Brian
Peter Bromberg [C# MVP] - 26 Oct 2007 15:10 GMT
Well, if you load a compressed file into a byte array, you can create two new byte arrays and store the first half of the original bytes in one, and the second half in the other, and then save them to the filesystem. Are you asking "how to do this?"
-- Peter
Recursion: see Recursion
site: http://www.eggheadcafe.com
unBlog: http://petesbloggerama.blogspot.com
BlogMetaFinder: http://www.blogmetafinder.com
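The two-halves approach described above can be sketched as follows. This is only a minimal illustration, not code from the thread: the class name HalfSplitter and the method SplitInTwo are hypothetical, and it assumes the file is small enough to be read into a single byte array.

```csharp
using System;
using System.IO;

static class HalfSplitter
{
    // Hypothetical helper: writes the first half of sourcePath to firstPath
    // and the remaining bytes to secondPath.
    public static void SplitInTwo(string sourcePath, string firstPath, string secondPath)
    {
        // Assumption: the whole file fits in memory.
        byte[] all = File.ReadAllBytes(sourcePath);

        // With an odd length, the extra byte goes to the second half.
        int firstLength = all.Length / 2;
        int secondLength = all.Length - firstLength;

        byte[] first = new byte[firstLength];
        byte[] second = new byte[secondLength];
        Array.Copy(all, 0, first, 0, firstLength);
        Array.Copy(all, firstLength, second, 0, secondLength);

        File.WriteAllBytes(firstPath, first);
        File.WriteAllBytes(secondPath, second);
    }
}
```

Because everything is held in memory at once, this probably only suits modest file sizes; a multi-gigabyte backup would need a streaming approach instead.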
> Hi everyone,
> [quoted text clipped - 5 lines]
>
> Brian
Brian Roisentul - 26 Oct 2007 15:58 GMT
Thanks for your answer, Peter.
I'll explain a little about what I'm trying to do, because maybe it can be done without zipping files.
Basically, I have a big file (a SharePoint site backup, *.fwp file) on one server that is moved to an FTP folder every night and downloaded from another PC. These tasks are run automatically by a C# process I wrote.
The download takes about 3 hours, so I thought splitting the file into pieces, and then downloading them using threading, would make this process quicker.
I tried to split the file without zipping it, but when I merge it back, it doesn't seem to work. If you want, I can paste my code here so you can see it, because maybe I did something wrong.
Peter Duniho - 26 Oct 2007 18:10 GMT
> [...]
> The download takes about 3 hours, so I thought splitting the
> file into pieces, and then downloading them using threading, would
> make this process quicker.
Unless you can take advantage of multiple network connections by downloading the pieces over those different network connections, splitting them up isn't going to help at all.
> I tried to split the file without zipping it, but when I merge it
> back, it doesn't seem to work. If you want, I can paste my code here
> so you can see it, because maybe I did something wrong.
Assuming you've saved all of the data and assuming you've reassembled the pieces in the same order they were in the original data, it should work. However, before you spend any more time on this, I would recommend you take advantage of the fact that you now have multiple pieces of the original data. Confirm, at least, that the total size of the pieces is the same as the size of the original data before splitting, and then compare the download time for all of the pieces with the download time for the original unsplit data.
Assuming you are like most of us and have a single network connection, you will find no appreciable difference in download time.
Pete
 Signature I'm trying a new usenet client for Mac, Nemo OS X. You can download it at http://www.malcom-mac.com/nemo
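The size check suggested above (total size of the pieces versus the size of the original) could look roughly like this. The class SplitCheck and the method SizesMatch are hypothetical names chosen for illustration; the caller is assumed to supply the list of part paths.

```csharp
using System;
using System.IO;

static class SplitCheck
{
    // Hypothetical helper: returns true when the sizes of all part files
    // add up exactly to the size of the original file.
    public static bool SizesMatch(string originalPath, string[] partPaths)
    {
        long total = 0;
        foreach (string part in partPaths)
        {
            total += new FileInfo(part).Length;
        }
        return total == new FileInfo(originalPath).Length;
    }
}
```

A matching total only shows that no bytes went missing; it does not prove the bytes are in the right order.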
Peter Bromberg [C# MVP] - 26 Oct 2007 18:18 GMT
What I would probably do here is to use either WinZip or WinRAR with a batch file (they have DOS companions that accept batch commands) with "Store only" (no compression, for speed) to split up the files. This .bat or .cmd file can be run on a scheduled basis through Task Scheduler.
-- Peter
Recursion: see Recursion
site: http://www.eggheadcafe.com
unBlog: http://petesbloggerama.blogspot.com
BlogMetaFinder: http://www.blogmetafinder.com
> Thanks for your answer Peter.
> [quoted text clipped - 13 lines]
> back, it doesn't seem to work. If you want I can paste my code here so
> you can see it, because maybe I did something wrong.
Rad [Visual C# MVP] - 26 Oct 2007 22:12 GMT
> Hi everyone,
> [quoted text clipped - 5 lines]
>
> Brian
Take a look at the 7zip SDK ... it may allow you to programmatically create a spanned archive
-- http://bytes.thinkersroom.com
Brian Roisentul - 29 Oct 2007 13:47 GMT
> On Fri, 26 Oct 2007 06:46:33 -0700, Brian Roisentul
> [quoted text clipped - 13 lines]
>
> --http://bytes.thinkersroom.com
I'll try this, thanks.
But I also wanted to ask something...
Should splitting a file and then merging it back always work, even with "complex" file types?
I'm asking because I tried it with a ".txt" file and it worked. But when I tried it with an Excel file that had VB code inside, opening it gave me a message saying something was wrong; I could still see the grid's content, but the VB code wasn't there any more.
The file's size was the same as the original.
Then, when I tried it with a SharePoint backup file (*.fwp) and tried to restore a site from the command line, I got an error saying the file was corrupted or something like that.
Thanks again,
Brian
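Splitting and merging only copies bytes, so the file type should not matter; if the merged file behaves differently from the original, some bytes were lost or moved. One way to check a round trip is to compare a hash of the original with a hash of the merged file. The sketch below is an assumption-laden illustration, not something proposed in the thread: the class RoundTripCheck and the method SameContent are hypothetical, and SHA-256 is just one reasonable choice of hash.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class RoundTripCheck
{
    // Hypothetical helper: returns true when both files hash to the same
    // SHA-256 value, which for practical purposes means identical content.
    public static bool SameContent(string pathA, string pathB)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream a = File.OpenRead(pathA))
        using (FileStream b = File.OpenRead(pathB))
        {
            byte[] hashA = sha.ComputeHash(a);
            byte[] hashB = sha.ComputeHash(b);
            // Compare the two digests as strings to keep the sketch short.
            return Convert.ToBase64String(hashA) == Convert.ToBase64String(hashB);
        }
    }
}
```

Two files of equal size can still differ in content, which would be consistent with the Excel symptom described above, so a content comparison like this is a stricter test than a size comparison.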
Brian Roisentul - 29 Oct 2007 15:45 GMT
> > On Fri, 26 Oct 2007 06:46:33 -0700, Brian Roisentul
> [quoted text clipped - 37 lines]
BTW, this is my code, maybe I did something wrong:
// Note: this.ext is a string property which represents the file's extension.

/****************************************************************************
 splitFile procedure
 ****************************************************************************/
private void splitFile( string path, string path_parts, long size_part )
{
    long parts = 0;

    try
    {
        using ( FileStream fs = new FileStream( path, FileMode.Open ) )
        {
            flength = fs.Length;

            parts = fs.Length / long.Parse( size_part.ToString() );

            if ( parts > 0 )
            {
                for ( int i = 0; i < parts; i++ )
                {
                    string filename = Path.Combine( path_parts, "spfile" + i.ToString() + this.ext );

                    using ( FileStream fsOut = new FileStream( filename, FileMode.Create ) )
                    {
                        byte[] arrBytes = new byte[ flength ];

                        int offSet = ( int ) ( size_part * i ) + ( ( i > 0 ) ? 1 : 0 );
                        int count = ( int ) size_part;

                        int ret = fs.Read( arrBytes, offSet, count );

                        fsOut.Write( arrBytes, offSet, count );

                        arrBytes = null;
                    }
                }
            }
        }
    }
    catch
    {
        throw;
    }
}
/****************************************************************************
 buildFile procedure
 ****************************************************************************/
private void buildFile( string path_parts, long size_part, long flength, string path_out )
{
    try
    {
        byte[] arrBytes = null;

        string filename_out = Path.Combine( path_out, "spfile_out" + this.ext );

        long parts = ( size_part != 0 ) ? flength / long.Parse( size_part.ToString() ) : 0;

        byte[] arrBytes_out = new byte[ flength ];

        for ( int i = 0; i < parts; i++ )
        {
            string filename = Path.Combine( path_parts, "spfile" + i.ToString() + this.ext );

            using ( FileStream fs = new FileStream( filename, FileMode.Open ) )
            {
                arrBytes = new byte[ size_part ]; // fs.Length

                int ret = fs.Read( arrBytes, 0, Convert.ToInt32( size_part ) ); // fs.Length

                arrBytes.CopyTo( arrBytes_out, i * size_part );
            }
        }

        using ( FileStream fsOut = new FileStream( filename_out, FileMode.Create ) )
            fsOut.Write( arrBytes_out, 0, Convert.ToInt32( flength ) );
    }
    catch( Exception ex )
    {
        throw;
    }
}
Peter Duniho - 29 Oct 2007 19:15 GMT
> BTW, this is my code, maybe I did something wrong:
The most serious issue I see is that you don't properly account for the fractional remainder of the file when you calculate the number of parts. Your division will return the number of full-size partitions of the file, but in most cases you'll have some extra bytes at the end that you're not saving.
You should keep track of how many bytes you've actually written, and then after writing out the full-size partitions, write a final partition that's whatever's less. Personally, I would forget about the calculation altogether and just write a loop that keeps writing bytes in chunks as large as you want or however many bytes you have remaining, whichever is less, until you have no more bytes to write. But how exactly you do this isn't so important as making sure you do it right.
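To make the remainder problem concrete: with a 1,050-byte file and 100-byte parts, the integer division 1050 / 100 gives 10, so only 1,000 bytes are written and the last 50 are dropped. If the part count is calculated at all, it needs to round up, and the last part needs its own shorter length. The helper names below (PartMath, CountParts, PartLength) are hypothetical and only illustrate the arithmetic.

```csharp
using System;

static class PartMath
{
    // Number of parts needed to hold every byte, counting a shorter final part.
    public static long CountParts(long fileLength, long partSize)
    {
        if (partSize <= 0) throw new ArgumentOutOfRangeException("partSize");
        // Integer ceiling division: rounds up whenever there is a remainder.
        return (fileLength + partSize - 1) / partSize;
    }

    // Number of bytes that belong in the part with the given zero-based index.
    public static long PartLength(long fileLength, long partSize, long index)
    {
        long remaining = fileLength - index * partSize;
        return Math.Min(partSize, remaining);
    }
}
```

For the numbers above, CountParts(1050, 100) yields 11 and PartLength(1050, 100, 10) yields 50, which accounts for the bytes the truncating division loses.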
There are other things about the code that are less-than-perfect (a couple of examples are mentioned after this paragraph), but as near as I can tell, the above is the most serious problem. I didn't bother to inspect the code that reconstructs the file, but assuming it was written with similar care as the code that splits the file, it likely has problems too.
I'm a bit bewildered at the "long.Parse(size_part.ToString())" business. The "size_part" variable is already a long; what possible value is there in converting that to a string and then back to a long?
You also have a strange calculation that screws up the "offset" variable; it turns out not to matter because you don't really need the variable at all. But it's still odd.
And why allocate a new buffer for each chunk you want to write, and why does that buffer have to be the length of the original file, and given that you're allocating a new buffer each time, why read the data anywhere other than the beginning of the buffer?
For splitting the file, I would recommend code that looks more like this:
void splitFile(string path, string path_parts, int size_part)
{
    using (FileStream fs = new FileStream(path, FileMode.Open))
    {
        int ipart = 0;
        byte[] arrBytes = new byte[size_part];
        int cbRead;

        // Read returns 0 at end of file; the last chunk may be shorter than size_part.
        while ((cbRead = fs.Read(arrBytes, 0, size_part)) > 0)
        {
            string filename = Path.Combine(path_parts,
                Path.GetFileNameWithoutExtension(path) + ipart.ToString() + this.ext);

            using (FileStream fsOut = new FileStream(filename, FileMode.Create))
            {
                // Write only the bytes actually read.
                fsOut.Write(arrBytes, 0, cbRead);
            }

            ipart++;
        }
    }
}
In other words, just read chunks of the original file until you can't read any more, writing them out one by one, each to a new file.
Reading the parts and reconstructing the file should be similarly simple. And remember, the more complicated you make the code, the easier it is to create a bug in the code. The single most important thing you can do to ensure your code is correct and free of bugs is to make it as simple as you can.
Pete
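The reconstruction step mentioned above was not posted in the thread. A minimal sketch of what it might look like, following the same read-until-empty pattern, is shown here. The class FileJoiner, the method JoinParts, its parameters, and the 64 KB buffer size are all assumptions; the parts are assumed to be named prefix + index + ext with indexes starting at 0 and no gaps, as the split code above produces.

```csharp
using System;
using System.IO;

static class FileJoiner
{
    // Hypothetical helper: appends prefix0+ext, prefix1+ext, ... from partsDir
    // to outputPath, stopping at the first index that has no file.
    // Returns the number of parts that were joined.
    public static int JoinParts(string partsDir, string prefix, string ext, string outputPath)
    {
        byte[] buffer = new byte[64 * 1024]; // arbitrary copy buffer size
        int ipart = 0;

        using (FileStream fsOut = new FileStream(outputPath, FileMode.Create))
        {
            while (true)
            {
                string partName = Path.Combine(partsDir, prefix + ipart.ToString() + ext);
                if (!File.Exists(partName)) break;

                using (FileStream fsIn = new FileStream(partName, FileMode.Open, FileAccess.Read))
                {
                    int cbRead;
                    // Copy whatever each Read call returns until the part is exhausted.
                    while ((cbRead = fsIn.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        fsOut.Write(buffer, 0, cbRead);
                    }
                }

                ipart++;
            }
        }

        return ipart;
    }
}
```

Counting up by index, rather than sorting a directory listing, keeps the parts in numeric order; a plain alphabetical sort would put part 10 before part 2.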
Brian Roisentul - 30 Oct 2007 14:58 GMT
> > BTW, this is my code, maybe I did something wrong:
> [quoted text clipped - 69 lines]
>
> Pete
It worked!
Thanks for taking the time and for the good explanation. I learn a lot from this kind of thing.
I'm only 20 and haven't been programming for long, so I'm working every day on improving my level.
Thanks everyone for helping :)