How to split a compressed file programmatically?

Brian Roisentul - 26 Oct 2007 14:46 GMT
Hi everyone,

I've been looking into this for a while, but with no luck so far.

Does anybody know a way to do it?

Many thanks,

Brian
Peter Bromberg [C# MVP] - 26 Oct 2007 15:10 GMT
Well,
if you load a compressed file into a byte array, you can create two new byte
arrays and store the first half of the original bytes in one, and the second
half in the other, and then save them to the filesystem.  Are you asking "how
to do this?"
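
As a rough illustration of the two-halves idea described above, here is a minimal sketch. The class and method names (HalfSplitter, SplitInHalf) are made up for this example, and it assumes the file is small enough to load into memory in one go (File.ReadAllBytes cannot handle files larger than 2 GB).

```csharp
using System;
using System.IO;

static class HalfSplitter
{
    // Hypothetical helper: writes the first half of sourcePath to firstPath
    // and the remaining bytes to secondPath.
    public static void SplitInHalf(string sourcePath, string firstPath, string secondPath)
    {
        byte[] all = File.ReadAllBytes(sourcePath);    // whole file in memory
        int firstLength = all.Length / 2;
        int secondLength = all.Length - firstLength;   // gets the extra byte when the length is odd

        byte[] first = new byte[firstLength];
        byte[] second = new byte[secondLength];
        Array.Copy(all, 0, first, 0, firstLength);
        Array.Copy(all, firstLength, second, 0, secondLength);

        File.WriteAllBytes(firstPath, first);
        File.WriteAllBytes(secondPath, second);
    }
}
```

For a multi-gigabyte backup file this approach would likely run out of memory, so a streaming loop that reads and writes fixed-size chunks is the safer choice.
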
-- Peter
Recursion: see Recursion
site:  http://www.eggheadcafe.com
unBlog:  http://petesbloggerama.blogspot.com
BlogMetaFinder:    http://www.blogmetafinder.com

Brian Roisentul - 26 Oct 2007 15:58 GMT
Thanks for your answer, Peter.

I'll explain a bit more about what I'm trying to do, because maybe it can
be done without zipping files.

Basically, I have a big file (a SharePoint site backup, *.fwp) on one
server that is moved to an FTP folder every night and downloaded from
another PC. These tasks are run automatically by a C# process I wrote.

The download takes about 3 hours, so I thought that splitting the file
into pieces and then downloading them on separate threads would make
this process quicker.

I tried to split the file without zipping it, but when I merge the
pieces back together, the result doesn't seem to work. If you want, I
can paste my code here so you can see it, because maybe I did something
wrong.
Peter Duniho - 26 Oct 2007 18:10 GMT
> [...]
> The time of download is of about 3hs, so I thought splitting the
> file into pieces, and then downloading them using threading, will
> make this process quicker.

Unless you can take advantage of multiple network connections by
downloading the pieces over those different network connections,
splitting them up isn't going to help at all.

> I tried to split the file without zipping it, but when I merge it
> back, it doesn't seem to work. If you want I can paste my code here
> so you can see it, because maybe I did something wrong.

Assuming you've saved all of the data and assuming you've reassembled
the pieces in the same order they were in the original data, it should
work.  However, before you spend any more time on this, I would
recommend you take advantage of the fact that you now have
multiple pieces of the original data.  Confirm, at least, that the
total size of the pieces is the same as the size of the original data
before splitting, and then compare download times for all of the
pieces with the download time for the original unsplit data.
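
The size check suggested above could be sketched roughly as follows. PartSizeCheck and SizesMatch are hypothetical names chosen for this example; the caller is assumed to supply the full paths of the original file and of every part.

```csharp
using System;
using System.IO;

static class PartSizeCheck
{
    // Hypothetical helper: returns true when the lengths of all part files
    // add up to exactly the length of the original file.
    public static bool SizesMatch(string originalPath, string[] partPaths)
    {
        long total = 0;
        foreach (string part in partPaths)
        {
            total += new FileInfo(part).Length;   // size on disk, no need to open the file
        }
        return total == new FileInfo(originalPath).Length;
    }
}
```

A mismatch here points at the splitting step (for example, a final partial chunk that was never written) rather than at the merge.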

Assuming you are like most of us and have a single network connection,
you will find no appreciable difference in download time.

Pete

Peter Bromberg [C# MVP] - 26 Oct 2007 18:18 GMT
What I would probably do here is use either WinZip or WinRAR from a batch
file (both have command-line companions that accept batch commands), with
"Store only" (no compression, for speed), to split up the files. This .bat
or .cmd file can be run on a schedule through Task Scheduler.
-- Peter
Rad [Visual C# MVP] - 26 Oct 2007 22:12 GMT

Take a look at the 7-Zip SDK ... it may allow you to programmatically
create a spanned archive.

--
http://bytes.thinkersroom.com
Brian Roisentul - 29 Oct 2007 13:47 GMT

I'll try this, thanks.

But I also wanted to ask something else...

Should splitting a file and then merging it back always work, even with
"complex" file types?

I'm asking because I tried it with a ".txt" file and it worked, but
when I tried it with an Excel file containing VB code, Excel showed a
message saying something was wrong when I opened it. I could still see
the grid's content, but the VB code wasn't there any more.

The file's size was the same as the original.

Then, when I tried it with a SharePoint backup file (*.fwp) and tried
to restore a site from the command line, I got an error saying the
file was corrupted or something like that.

Thanks again,

Brian
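
A rebuilt file can have the right length and still differ from the original in content, so comparing hashes is a stronger check than comparing sizes. The sketch below is one way to do that; FileCompare and SameContent are names invented for this example, and SHA-256 is just one reasonable choice of hash.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class FileCompare
{
    // Hypothetical helper: true when both files hash to the same SHA-256 value.
    public static bool SameContent(string pathA, string pathB)
    {
        return Convert.ToBase64String(Hash(pathA)) == Convert.ToBase64String(Hash(pathB));
    }

    // Streams the file through the hash, so large files are not loaded into memory.
    static byte[] Hash(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream fs = File.OpenRead(path))
        {
            return sha.ComputeHash(fs);
        }
    }
}
```

Splitting and merging works on raw bytes, so the file type should not matter. If SameContent returns false for the original and the rebuilt file, the bug is in the split or merge code, not in the format.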
Brian Roisentul - 29 Oct 2007 15:45 GMT

BTW, this is my code; maybe I did something wrong:

//Note: this.ext is a string property which represents the file's extension.
/****************************************************************************
                               splitFile procedure
****************************************************************************/
private void splitFile( string path, string path_parts, long size_part )
{
    long parts = 0;

    try
    {
        using ( FileStream fs = new FileStream( path, FileMode.Open ) )
        {
            flength = fs.Length;

            parts = fs.Length / long.Parse( size_part.ToString() );

            if ( parts > 0 )
            {
                for ( int i = 0; i < parts; i++ )
                {
                    string filename = Path.Combine( path_parts, "spfile" + i.ToString() + this.ext );

                    using ( FileStream fsOut = new FileStream( filename, FileMode.Create ) )
                    {
                        byte[] arrBytes = new byte[ flength ];

                        int offSet = ( int ) ( size_part * i ) + ( ( i > 0 ) ? 1 : 0 );
                        int count = ( int ) size_part;

                        int ret = fs.Read( arrBytes, offSet, count );

                        fsOut.Write( arrBytes, offSet, count );

                        arrBytes = null;
                    }
                }
            }
        }
    }
    catch
    {
        throw;
    }
}

/****************************************************************************
                               buildFile procedure
****************************************************************************/
private void buildFile( string path_parts, long size_part, long flength, string path_out )
{
    try
    {
        byte[] arrBytes = null;

        string filename_out = Path.Combine( path_out, "spfile_out" + this.ext );

        long parts = ( size_part != 0 ) ? flength / long.Parse( size_part.ToString() ) : 0;

        byte[] arrBytes_out = new byte[ flength ];

        for ( int i = 0; i < parts; i++ )
        {
            string filename = Path.Combine( path_parts, "spfile" + i.ToString() + this.ext );

            using ( FileStream fs = new FileStream( filename, FileMode.Open ) )
            {
                arrBytes = new byte[ size_part ];//fs.Length

                int ret = fs.Read( arrBytes, 0, Convert.ToInt32( size_part ) );//fs.Length

                arrBytes.CopyTo( arrBytes_out, i * size_part );
            }
        }

        using ( FileStream fsOut = new FileStream( filename_out, FileMode.Create ) )
            fsOut.Write( arrBytes_out, 0, Convert.ToInt32( flength ) );
    }
    catch( Exception ex )
    {
        throw;
    }
}
Peter Duniho - 29 Oct 2007 19:15 GMT
> BTW, this is my code, maybe i did something wrong:

The most serious issue I see is that you don't properly account for the
fractional remainder of the file when you calculate the number of
parts.  Your division will return the number of full-size partitions of
the file, but in most cases you'll have some extra bytes at the end
that you're not saving.

You should keep track of how many bytes you've actually written, and
then after writing out the full-size partitions, write a final
partition that's whatever's less.  Personally, I would forget about the
calculation altogether and just write a loop that keeps writing bytes
in chunks as large as you want or however many bytes you have
remaining, whichever is less, until you have no more bytes to write.  
But how exactly you do this isn't so important as making sure you do it
right.

There are other things about the code that are less-than-perfect (a
couple of examples are mentioned after this paragraph), but as near as
I can tell, the above is the most serious problem.  I didn't bother to
inspect the code that reconstructs the file, but assuming it was
written with similar care as the code that splits the file, it likely
has problems too.

I'm a bit bewildered at the "long.Parse(size_part.ToString())"
business.  The "size_part" variable is already a long; what possible
value is there in converting that to a string and then back to a long?

You also have a strange calculation that screws up the "offset"
variable; it turns out not to matter because you don't really need the
variable at all.  But it's still odd.

And why allocate a new buffer for each chunk you want to write, and why
does that buffer have to be the length of the original file, and given
that you're allocating a new buffer each time, why read the data
anywhere other than the beginning of the buffer?

For splitting the file, I would recommend code that looks more like this:

   void splitFile(string path, string path_parts, int size_part)
   {
       using (FileStream fs = new FileStream(path, FileMode.Open))
       {
           int ipart = 0;
           byte[] arrBytes = new byte[size_part];
           int cbRead;

           // Read up to size_part bytes at a time; Read returns 0 at end of file
           while ((cbRead = fs.Read(arrBytes, 0, size_part)) > 0)
           {
               string filename = Path.Combine(path_parts,
                   Path.GetFileNameWithoutExtension(path) + ipart.ToString() + this.ext);

               using (FileStream fsOut = new FileStream(filename, FileMode.Create))
               {
                   // Write only the bytes actually read, so the last part may be shorter
                   fsOut.Write(arrBytes, 0, cbRead);
               }

               ipart++;
           }
       }
   }

In other words, just read chunks of the original file until you can't
read any more, writing them out one by one, each to a new file.

Reading the parts and reconstructing the file should be similarly
simple.  And remember, the more complicated you make the code, the
easier it is to create a bug in the code.  The single most important
thing you can do to ensure your code is correct and free of bugs is to
make it as simple as you can.
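
For the reconstruction side, a sketch in the same spirit might look like the following. This is not code from the thread: FileJoiner, BuildFile and the 64 KB buffer size are assumptions made for illustration, and the caller is assumed to pass the part paths already in their original order.

```csharp
using System;
using System.IO;

static class FileJoiner
{
    // Hypothetical helper: appends the part files, in the order given,
    // into a single output file.
    public static void BuildFile(string[] partPaths, string outputPath)
    {
        byte[] buffer = new byte[64 * 1024];   // one buffer reused for every read

        using (FileStream fsOut = new FileStream(outputPath, FileMode.Create))
        {
            foreach (string partPath in partPaths)
            {
                using (FileStream fsIn = new FileStream(partPath, FileMode.Open, FileAccess.Read))
                {
                    int cbRead;
                    // Copy until Read reports end of file, writing only the bytes read
                    while ((cbRead = fsIn.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        fsOut.Write(buffer, 0, cbRead);
                    }
                }
            }
        }
    }
}
```

The order of the parts matters. If the list is built from a directory listing, a plain alphabetical sort puts "spfile10" before "spfile2", so it is safer to generate the names from the part index, as the split code does.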

Pete
Brian Roisentul - 30 Oct 2007 14:58 GMT

It worked!

Thanks for taking the time and for your good explanation. I learn a
lot from this kind of thing.

I'm only 20 and I haven't been programming for long, so I'm working
every day on improving my level.

Thanks everyone for helping :)
