Splitting files and images

One function we might find useful is the ability to split images up into usable chunks, either for archiving or for use in another program. We will first discuss using split on its own, then in conjunction with dd for “on the fly” splitting.

For example, you might have a 10GB image that you want to split into 640MB parts so they can be written to CD-R media. Or, if you use a program such as _Ilook Investigator_ and need files no larger than 2GB (for a FAT32 partition), you might want to split the image into 2GB pieces. For this we use the split command.

split normally works on lines of input (i.e. from a text file). But if we use the -b option, we force split to treat the file as _binary_ input, and lines are ignored. We can specify the size of the files we want along with the prefix we want for the output files. The command looks like:

split -b XXm file_to_split output_prefix

where XX is the size of each resulting file (the m suffix means megabytes). For example, if we have a 6GB image called image.disk1.dd, we can split it into 2GB files using the following command:

split -b 2000m image.disk1.dd image.split.

This would result in three files, each about 2GB in size and named with the prefix “image.split.” as specified in the command, followed by “aa”, “ab”, “ac”, and so on:

image.split.aa

image.split.ab

image.split.ac
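The behaviour described above can be tried safely on a small scale. The following is a minimal sketch, assuming a GNU/Linux system; the 5 MiB scratch file, the 2 MiB piece size, and the temporary working directory are illustrative stand-ins, not values from the example above.

```shell
# Sketch only: a 5 MiB scratch file stands in for a real disk image.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create the stand-in image: 5 blocks of 1048576 bytes (1 MiB), filled with zeros.
dd if=/dev/zero of=image.disk1.dd bs=1048576 count=5 2>/dev/null

# Split into 2 MiB pieces; output files take the prefix "image.split."
split -b 2m image.disk1.dd image.split.

# List the pieces, then record how many there are and the size of the last one.
ls -l image.split.*
piece_count=$(ls image.split.* | wc -l | tr -d ' ')
last_size=$(wc -c < image.split.ac | tr -d ' ')
echo "pieces: $piece_count, last piece: $last_size bytes"
```

The last piece simply holds whatever is left over, so it is usually smaller than the others.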

The process can be reversed. If we want to reassemble the image from the split parts (from CD-R, etc.), we can use the cat command and redirect the output to a new file. Remember, cat simply streams the specified files to standard output. If you redirect this output, the files are assembled into one.

cat image.split.aa image.split.ab image.split.ac > image.new

or

cat image.split.a* > image.new
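To check that reassembly really reproduces the original, the round trip can be sketched on a scratch file. This assumes a GNU/Linux system; the random test data, the 1 MiB piece size, and the use of cmp for the comparison are illustrative choices rather than part of the procedure above.

```shell
# Sketch only: random data stands in for a real image, so a piece
# assembled in the wrong order would be detected.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
dd if=/dev/urandom of=image.disk1.dd bs=1024 count=3000 2>/dev/null
split -b 1m image.disk1.dd image.split.

# The shell expands the glob in sorted order (aa, ab, ac, ...),
# which is the same order in which split wrote the pieces.
cat image.split.a* > image.new

# cmp is silent and exits 0 when the two files are byte-for-byte identical.
if cmp image.disk1.dd image.new; then
    result="identical"
else
    result="different"
fi
echo "reassembled image is $result"
```

Note that the pattern image.split.a* only matches suffixes aa through az. With more than 26 pieces, split continues with ba, bb, and so on, and a wider pattern such as image.split.* would be needed.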

Another way of accomplishing this would be to split the image as we create it (i.e. from a dd command). This is essentially the “on the fly” splitting we mentioned earlier. We do this by piping the output of the dd command straight to split.

dd if=/dev/hdx | split -b 2000m - image.split.

In this case, instead of giving the name of the file to be split in the split command, we give a simple “-”. The single dash is a descriptor that means “standard input”. In other words, the command is taking its input from the data pipe provided by the standard output of dd.

Once we have the image, the same technique using cat will allow us to reassemble it for hashing or analysis.
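The pipe from dd into split can be rehearsed without touching a real device. In this sketch, source.img is a hypothetical stand-in for a device such as /dev/hdx, and the sizes are chosen only to keep the run short; a GNU/Linux system is assumed.

```shell
# Sketch only: source.img stands in for a real device node.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
dd if=/dev/urandom of=source.img bs=1024 count=2500 2>/dev/null

# dd writes the data to standard output; split reads it from "-"
# (standard input) and cuts it into 1 MiB pieces as it arrives,
# so no full-size copy of the image is written along the way.
dd if=source.img 2>/dev/null | split -b 1m - image.split.

# The total size of the pieces should equal the size of the source.
source_bytes=$(wc -c < source.img | tr -d ' ')
piece_bytes=$(cat image.split.* | wc -c | tr -d ' ')
echo "source: $source_bytes bytes, pieces: $piece_bytes bytes"
```

A matching byte count is only a quick sanity check. It does not replace comparing hashes.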

For practice, let’s take the practical exercise floppy disk we used earlier and try this method on that disk, splitting it into 360k pieces:

sha1sum /dev/fd0

f5ee9cf56f23e5f5773e2a4854360404a62015cf  /dev/fd0

dd if=/dev/fd0 | split -b 360k - floppy.split.

2880+0 records in

2880+0 records out

  • Remember, the “records” are 512-byte blocks (512 × 2880 = 1,474,560 bytes, the capacity of a 1.44MB floppy).

ls -lh

-rw-r--r-- 1 root root 360k Aug 14 08:13 floppy.split.aa

-rw-r--r-- 1 root root 360k Aug 14 08:13 floppy.split.ab

-rw-r--r-- 1 root root 360k Aug 14 08:13 floppy.split.ac

-rw-r--r-- 1 root root 360k Aug 14 08:13 floppy.split.ad
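The four equal pieces in this listing follow directly from the numbers dd reported. A quick check with shell arithmetic, using only the values quoted above (the variable names are illustrative):

```shell
# Arithmetic check using only values quoted in the text.
block_size=512               # bytes per dd "record"
records=2880                 # records reported by dd
piece_size=$((360 * 1024))   # split -b 360k means 360 * 1024 bytes

total_bytes=$((block_size * records))
pieces=$((total_bytes / piece_size))
remainder=$((total_bytes % piece_size))

echo "total bytes: $total_bytes"
echo "full pieces: $pieces, leftover bytes: $remainder"
```

This prints a total of 1474560 bytes, which divides into exactly 4 pieces of 368640 bytes with nothing left over, so there is no smaller trailing piece.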

cat floppy.split.a* | sha1sum

f5ee9cf56f23e5f5773e2a4854360404a62015cf  -

(The output of this command shows a “-” in place of the filename. This represents the fact that the hash was calculated from “standard input” to sha1sum, not a file or device.)

cat floppy.split.a* > new.floppy.image

ls -lh

-rw-r--r-- 1 root root 360k Aug 14 08:13 floppy.split.aa

-rw-r--r-- 1 root root 360k Aug 14 08:13 floppy.split.ab

-rw-r--r-- 1 root root 360k Aug 14 08:13 floppy.split.ac

-rw-r--r-- 1 root root 360k Aug 14 08:13 floppy.split.ad

-rw-r--r-- 1 root root 1.4M Aug 14 08:14 new.floppy.image

sha1sum new.floppy.image

f5ee9cf56f23e5f5773e2a4854360404a62015cf  new.floppy.image

Looking at the output of the above commands, we see that all the SHA1 hashes match. We find the same hash for the disk, for the split images “cat-ed” together, and for the newly reassembled image.
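Rather than comparing the three hashes by eye, the comparison can be scripted. The sketch below assumes a GNU/Linux system with sha1sum and awk available, and uses a scratch file named floppy.img (a hypothetical stand-in for /dev/fd0) filled with random data of the same size as the floppy.

```shell
# Sketch only: floppy.img stands in for the real floppy device.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
dd if=/dev/urandom of=floppy.img bs=512 count=2880 2>/dev/null

# 1. Hash of the source. awk keeps only the hash field and drops the name.
hash_source=$(sha1sum floppy.img | awk '{print $1}')

# 2. Split on the fly, then hash the pieces streamed back together.
dd if=floppy.img 2>/dev/null | split -b 360k - floppy.split.
hash_stream=$(cat floppy.split.a* | sha1sum | awk '{print $1}')

# 3. Reassemble into a new image file and hash that file.
cat floppy.split.a* > new.floppy.image
hash_new=$(sha1sum new.floppy.image | awk '{print $1}')

# All three values must be identical for the split to be trustworthy.
if [ "$hash_source" = "$hash_stream" ] && [ "$hash_source" = "$hash_new" ]; then
    verdict="all three hashes match"
else
    verdict="MISMATCH"
fi
echo "$verdict"
```

Because the test data is random, the hash value itself will differ on every run. Only the agreement between the three values matters.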
