Monday, October 15, 2012

Manipulating Zip Files with PeopleCode

I've seen a few forum posts that show how to zip files using both Exec and the XML Publisher PSXP_RPTDEFNMANAGER:Utility app package. Those are great options, but might not fit every scenario. Since the Java API includes support for zip files, let's investigate how we can use it to create or extract zip files.

Java allows developers to create zip files by writing data to a ZipOutputStream. We've used OutputStreams a few times on this blog to write data to files. A ZipOutputStream is just a wrapper around an OutputStream that writes contents in the zip file format. Here is an example of reading a text file and writing it out to a ZipOutputStream:

REM ** The file I want to compress;
Local string &fileNameToZip = "c:\temp\blah.txt";

REM ** The internal zip file's structure -- internal location of blah.txt;
Local string &zipInternalPath = "my/internal/zip/folder/structure";

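REM ** False = overwrite. Appending (True) to an existing compressed.zip would leave the old archive ahead of the new one;
Local JavaObject &zip = CreateJavaObject("java.util.zip.ZipOutputStream", CreateJavaObject("java.io.FileOutputStream", "c:\temp\compressed.zip", False));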

Local JavaObject &file = CreateJavaObject("java.io.File", &fileNameToZip);
REM ** We will read &fileNameToZip into a buffer and write it out to &zip;
Local JavaObject &buf = CreateJavaArray("byte[]", 1024);

Local number &byteCount;
Local JavaObject &in = CreateJavaObject("java.io.FileInputStream", &fileNameToZip);

Local JavaObject &zipEntry = CreateJavaObject("java.util.zip.ZipEntry", &zipInternalPath | "/" | &file.getName());

REM ** Make sure zip entry retains original modified date;
&zipEntry.setTime(&file.lastModified());

&zip.putNextEntry(&zipEntry);

&byteCount = &in.read(&buf);

While &byteCount > 0
   &zip.write(&buf, 0, &byteCount);
   &byteCount = &in.read(&buf);
End-While;

&in.close();
&zip.flush();
&zip.close();

To add multiple files to a single zip file, we can convert the above code into a function (preferably a FUNCLIB function) and then call it multiple times, once for each file:

Function AddFileToZip(&zipInternalPath, &fileNameToZip, &zip)
   Local JavaObject &file = CreateJavaObject("java.io.File", &fileNameToZip);
   REM ** We will read &fileNameToZip into a buffer and write it out to &zip;
   Local JavaObject &buf = CreateJavaArray("byte[]", 1024);
   
   Local number &byteCount;
   Local JavaObject &in = CreateJavaObject("java.io.FileInputStream", &fileNameToZip);
   
   Local JavaObject &zipEntry = CreateJavaObject("java.util.zip.ZipEntry", &zipInternalPath | "/" | &file.getName());
   
   REM ** Make sure zip entry retains original modified date;
   &zipEntry.setTime(&file.lastModified());
   
   &zip.putNextEntry(&zipEntry);
   
   &byteCount = &in.read(&buf);
   
   While &byteCount > 0
      &zip.write(&buf, 0, &byteCount);
      &byteCount = &in.read(&buf);
   End-While;
   
   &in.close();
End-Function;


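REM ** False = overwrite. Appending (True) to an existing compressed.zip would leave the old archive ahead of the new one;
Local JavaObject &zip = CreateJavaObject("java.util.zip.ZipOutputStream", CreateJavaObject("java.io.FileOutputStream", "c:\temp\compressed.zip", False));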

AddFileToZip("folder1", "c:\temp\file1.txt", &zip);
AddFileToZip("folder1", "c:\temp\file2.txt", &zip);
AddFileToZip("folder2", "c:\temp\file1.txt", &zip);
AddFileToZip("folder2", "c:\temp\file2.txt", &zip);

&zip.flush();
&zip.close();

The content to zip doesn't have to come from a static file in your file system. It could come from the database or... well, anywhere. Here is an example of zipping static text. In this example I intentionally left the internal zip file path (folder) blank to show how to create a zip file with no internal folder structure.

Local JavaObject &textToCompress = CreateJavaObject("java.lang.String", "This is some text to compress... probably a bloated XML document or something ;)");
Local string &zipInternalFileName = "contents.txt";

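REM ** False = overwrite. Appending (True) to an existing compressed.zip would leave the old archive ahead of the new one;
Local JavaObject &zip = CreateJavaObject("java.util.zip.ZipOutputStream", CreateJavaObject("java.io.FileOutputStream", "c:\temp\compressed.zip", False));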
Local JavaObject &zipEntry = CreateJavaObject("java.util.zip.ZipEntry", &zipInternalFileName);
Local JavaObject &buf = &textToCompress.getBytes();
Local number &byteCount = &buf.length;

&zip.putNextEntry(&zipEntry);

&zip.write(&buf, 0, &byteCount);

&zip.flush();
&zip.close();

And, finally, unzipping files. The following example prints the text inside each file from a zip file named "compressed.zip" that contains four fictitious text files named file1.txt, file2.txt, file3.txt, and file4.txt.

Local JavaObject &zipFileInputStream = CreateJavaObject("java.io.FileInputStream", "c:\temp\compressed.zip");
Local JavaObject &zipInputStream = CreateJavaObject("java.util.zip.ZipInputStream", &zipFileInputStream);
Local JavaObject &zipEntry = &zipInputStream.getNextEntry();
Local JavaObject &buf = CreateJavaArray("byte[]", 1024);
Local number &byteCount;

While &zipEntry <> Null
   
   If (&zipEntry.isDirectory()) Then
      REM ** Directory entries carry no data, so there is nothing to read;
   Else
      Local JavaObject &out = CreateJavaObject("java.io.ByteArrayOutputStream");
      &byteCount = &zipInputStream.read(&buf);
      
      While &byteCount > 0
         &out.write(&buf, 0, &byteCount);
         &byteCount = &zipInputStream.read(&buf);
      End-While;
      
      &zipInputStream.closeEntry();
      MessageBox(0, "", 0, 0, &out.toString());
   End-If;
   
   &zipEntry = &zipInputStream.getNextEntry();
End-While;

&zipInputStream.close();
&zipFileInputStream.close();

What about unzipping binary files into the file system? I'll let you write that one.
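For anyone who wants a starting point, here is a minimal sketch of that exercise in plain Java (Java 7 syntax). The class name UnzipToFolder and the method unzip are made up for illustration and are not part of any PeopleSoft or Java library. It uses the same ZipInputStream loop as the listing above, but writes each entry to a FileOutputStream instead of a ByteArrayOutputStream, so binary content is copied byte for byte. The same calls could be made from PeopleCode with CreateJavaObject, as in the earlier listings.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, for illustration only
final class UnzipToFolder {

    /** Extracts every entry of zipPath beneath targetDir and returns the number of files written. */
    static int unzip(String zipPath, String targetDir) throws IOException {
        File root = new File(targetDir).getCanonicalFile();
        byte[] buf = new byte[1024];
        int files = 0;

        try (ZipInputStream zin = new ZipInputStream(new FileInputStream(zipPath))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                File out = new File(root, entry.getName()).getCanonicalFile();

                // Refuse entries such as "../../x" that would land outside targetDir
                if (!out.toPath().startsWith(root.toPath())) {
                    throw new IOException("Entry escapes target folder: " + entry.getName());
                }

                if (entry.isDirectory()) {
                    out.mkdirs();
                    continue;
                }

                // Entries may arrive without a preceding directory entry
                File parent = out.getParentFile();
                if (parent != null) {
                    parent.mkdirs();
                }

                try (OutputStream os = new FileOutputStream(out)) {
                    int n;
                    while ((n = zin.read(buf)) > 0) {
                        os.write(buf, 0, n);
                    }
                }
                files++;
            }
        }
        return files;
    }
}
```

A call such as UnzipToFolder.unzip("c:\\temp\\compressed.zip", "c:\\temp\\extracted") would recreate the zip's internal folder structure under the target folder.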

Password protected zip files? Java doesn't make this easy. There are a few Java libraries, but as Chris Rigsby points out here, using non-standard Java classes (including your own) can be hazardous. At this time, it seems the best way to password protect a zip file is to use Exec to call a command line zip program. On Linux with the zip utility, use the -P parameter to encrypt with a password.

32 comments:

Brett B said...

Is there a significant benefit to doing it this way instead of calling the system unzip?

Jim Marion said...

@Brett, good question. For some, there may be no difference.

The main benefit is that it doesn't require any system utilities. For those who run on Linux, finding a compression utility or writing a shell script is trivial. For Windows, however, there isn't a lot in the GPL/GNU space (except UnxUtils, of course). The example shown here uses the Java API, which is guaranteed to be in your PeopleSoft system.

The second benefit is shown in the third code listing. It shows how to stream information directly into a zip file without having to write to a file first. When it comes to distributed processing, servers, load balancing, etc., it is difficult to make assumptions about the file system. Now, what I didn't show was how to stream a zip out to the response object without ever writing to disk. That sounds like a good post for tonight :)
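As a rough sketch of the "never writing to disk" idea, the ZipOutputStream can wrap a ByteArrayOutputStream instead of a FileOutputStream, which leaves the finished zip in a byte array. This is plain Java with a made-up class name (InMemoryZip); how those bytes would then be handed to the PeopleSoft response object is not shown here and is outside what this sketch covers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only
final class InMemoryZip {

    /** Builds a one-entry zip entirely in memory and returns its bytes. */
    static byte[] zipText(String entryName, String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();

        // Closing the ZipOutputStream writes the central directory into the buffer
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(text.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }
}
```

The byte array is only complete after the ZipOutputStream has been closed, which is why toByteArray() is called outside the try block.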

Neeraj Kholiya said...

Hi Jim

A few months back we had a similar requirement for one of our clients and we ended up developing two different scripts, one for Unix and one for Windows.

Maintenance and development of the batch scripts was just a nightmare.

This looks like a much cleaner solution.

Jim Marion said...

@Neeraj, Thank you for the feedback. What you mention is a common scenario and why I went the Java/PeopleCode route instead of a batch file.

If you have to password protect the zip files, then the solution is not as clean, but for standard compression, this is a good solution.

Notice that I didn't have to use any reflection with the Java! Just nice, clean method calls.

kane81 said...

Hi, your post inspired me to put some code online that I wrote a while back, specifically for the zip password issue.

I have a free library that provides zipping with or without passwords as well as checking if office documents are encrypted.

The zip password is 'standard' encryption - so compatible with all zipping programs!

The library is Java and there is a wrapper so that it works in PeopleSoft with relative ease.

Example: &zipUtil.CreateZipFolderEncrypted(&FolderPathToZip, &ZipFileToSaveAs, &Password);


http://users.adam.com.au/kane81/PeopleSoft/Utilities/

Jim Marion said...

@Kane81, thanks for sharing!

Raajesh said...

Hi Jim,

A very useful tutorial.

Thanks for this post. I have expanded this tutorial to create TAR Files in PeopleCode, using JTAR library.

Jim Marion said...

@Raajesh, thanks for sharing!

Unknown said...

Hello jim,

I am using the code to zip multiple PDFs into a zip file. But when I unzip the files, the PDFs are under sub-directories.

e.g. if the PDF files are under
\\1.1.1\folder1\folder2\OutPut1.pdf
\\1.1.1\folder1\folder2\OutPut2.pdf

then the resultant zip file has the following structure

1.1.1\folder1\folder2\OutPut1.pdf
1.1.1\folder1\folder2\OutPut2.pdf

Is there a way to not include the sub-directories but just the pdf files inside the zip files?

Jim Marion said...

@Narender, yes, just set &zipInternalPath to "".
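One detail to watch: the listings in this post build the entry name as &zipInternalPath | "/" | &file.getName(), so an empty path produces a name with a leading slash ("/OutPut1.pdf"). Many tools cope with that, but it is safer to omit the separator when there is no folder. Below is a small sketch of that rule in plain Java; the class ZipEntryNames and its method are invented for illustration, and the same logic is easy to write as a PeopleCode If statement.

```java
// Hypothetical helper, for illustration only
final class ZipEntryNames {

    /** Joins an optional internal folder and a file name into a zip entry name. */
    static String entryName(String internalPath, String fileName) {
        // Zip entry names use forward slashes regardless of operating system
        String folder = internalPath == null ? "" : internalPath.replace('\\', '/');

        // Trim separators at both ends so the result never starts with "/"
        while (folder.startsWith("/")) {
            folder = folder.substring(1);
        }
        while (folder.endsWith("/")) {
            folder = folder.substring(0, folder.length() - 1);
        }

        return folder.isEmpty() ? fileName : folder + "/" + fileName;
    }
}
```

With this rule, an empty folder gives "OutPut1.pdf" and "folder1" gives "folder1/OutPut1.pdf".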

ChiDONEt said...

Is there any way to check a zip file's integrity before opening it or starting to process it? Let me try to explain. I receive a lot of zip files by SFTP and process them at night. This week I had a problem with a big zip file: my process began to open it while it was still arriving, and I did not receive any security error from the OS. In other words, I can open the zip file while it is still being transferred. I have tried several things, but so far I don't know how to check whether the transfer has finished so that I can safely open the zip file.

Jim Marion said...

@ChiDONEt, interesting use case. I don't have an answer for you. If this is PeopleSoft related, you can try posting your question on the PeopleSoft OTN General Discussion forum.

ChiDONEt said...

Thanks Jim... I will post on the PeopleSoft forum.

I know how to work around it, but I don't like my solution: check the size of the zip file and store it in a table; on the second run, if the size is the same, I can begin to process the zip file.

Jim Marion said...

@ChiDONEt, I don't know if it is feasible, but an MD5 hash sum is a good way to verify files as well. If you can store that instead of, or in addition to, the file size, that might be helpful. Hashes do require additional time to generate, though.
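Another option for this scenario, offered as a sketch rather than a tested PeopleSoft solution: a zip file's central directory is written at the end of the file, so java.util.zip.ZipFile generally refuses to open a file whose transfer stopped part way. The hypothetical helper below (ZipCheck is an invented name) tries to open the file and read every entry, and reports failure instead of throwing. It is not a full integrity check; it only shows that the archive can be opened and decompressed.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, for illustration only
final class ZipCheck {

    /** Returns true if the file opens as a zip and every entry can be read to the end. */
    static boolean isReadableZip(String path) {
        byte[] buf = new byte[4096];

        // Opening fails with an IOException if the file is missing or has no central directory
        try (ZipFile zf = new ZipFile(new File(path))) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                try (InputStream in = zf.getInputStream(entries.nextElement())) {
                    // Drain the entry; corrupt compressed data surfaces as an IOException
                    while (in.read(buf) > 0) {
                        // nothing to do with the bytes
                    }
                }
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A nightly process could skip any file for which isReadableZip returns false and pick it up on the next run.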

Unknown said...

Is it possible to zip multiple folders under a specified path into one zip file using PeopleCode?

Jim Marion said...

@Sandeep, absolutely! It is easier if you are on PT 8.53 and can use the new Java ZipFileSystem.

Unknown said...

@Jim - I am on PT 8.53. I have a requirement: there is a parent folder that contains many child folders (a child folder may or may not contain files like .doc, .pdf, etc.). I have to create a zip that contains these child folders and the files within them. Can you please guide me or provide sample code snippets for this? I appreciate your help in this regard.

Thanks,
Sandeep

Jim Marion said...

@Sandeep, good idea. I'll add it to the list.
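Until then, here is a rough sketch of the idea in plain Java 7, using the zip file system provider from java.nio. The class name ZipFolderTree and the method zipFolder are invented for this example. It opens the zip as a file system, walks the parent folder, recreates each child folder inside the zip (so empty folders survive), and copies each file across. Calling it from PeopleCode is not shown; static methods such as FileSystems.newFileSystem would be reached through GetJavaClass.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only
final class ZipFolderTree {

    /** Zips everything beneath sourceDir into zipFile. Keep zipFile outside sourceDir. */
    static void zipFolder(final Path sourceDir, Path zipFile) throws IOException {
        Map<String, String> env = new HashMap<String, String>();
        env.put("create", "true"); // create the zip file if it does not exist yet

        URI uri = URI.create("jar:" + zipFile.toAbsolutePath().toUri());
        try (final FileSystem zipFs = FileSystems.newFileSystem(uri, env)) {
            Files.walkFileTree(sourceDir, new SimpleFileVisitor<Path>() {

                // Maps a path under sourceDir to the matching path inside the zip
                private Path inZip(Path p) {
                    String relative = sourceDir.relativize(p).toString().replace('\\', '/');
                    return zipFs.getPath("/" + relative);
                }

                @Override
                public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                        throws IOException {
                    // The zip root already exists; only child folders need creating
                    if (!dir.equals(sourceDir)) {
                        Files.createDirectories(inZip(dir));
                    }
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.copy(file, inZip(file), StandardCopyOption.REPLACE_EXISTING);
                    return FileVisitResult.CONTINUE;
                }
            });
        }
    }
}
```

The zip is only finalized when the FileSystem is closed, which the try-with-resources block takes care of.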

Becca said...

Jim,

I was surprised that I couldn't find a thread on this topic already - so please excuse my placing this comment here.

I'm creating a CSV file from data loaded into a temp table in an App Engine. This works flawlessly until you reach the requirement that any field in the CSV which is NULL should output a space:
abc, ,1234, ,8/9/10, ,zyx

I've ensured that the spaces are preserved in the record, but cannot seem to keep them when writing to the file. Any ideas?

@Becca

Jim Marion said...

@Becca, did you try enforcing quotes around text?

Jim Marion said...

@Becca, the only other option I know of is to skip the FileLayout and write directly to a file by iterating over a set of rows and applying the appropriate delimiters yourself.
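To make the "apply the delimiters yourself" idea concrete, here is a tiny sketch in plain Java; the class CsvLine is a made-up name, and in an App Engine the same loop would more likely be written in PeopleCode against a File object. It joins the fields of one row with commas and writes a single space for any null or empty field, with no text qualifiers.

```java
// Hypothetical helper, for illustration only
final class CsvLine {

    /** Joins fields with commas, writing a single space for null or empty values. */
    static String join(String[] fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            String value = fields[i];
            line.append(value == null || value.isEmpty() ? " " : value);
        }
        return line.toString();
    }
}
```

Because there are no quotes, this only works when the field values themselves never contain a comma.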

Becca said...

Jim,

Since I'm not supposed to have any text qualifiers, it looks like I'm going to have to go with option #2. I kinda figured that was the case, but didn't want to write it off.

Thanks again!

@Becca

Anonymous said...

Hey,
I used the method below in App Engine PeopleCode and it works fine, but my requirement is that I need both the source and the target file:

Local JavaObject &source = CreateJavaObject("java.io.File", "/source/file.txt");
Local JavaObject &target = CreateJavaObject("java.io.File", "/target/file.txt");
&source.renameTo(&target);

This method works fine, but the issue is that I need both the source and the destination copy.

After moving the file with this method, no copy of the file is left in the first folder.
Is there a file copy option in Java that can be used in App Engine PeopleCode?
I even tried the PutAttachment function, but it throws an error like:
An invalid parameter has been passed to the file attachment function. (2,788)

my code :PutAttachment(URL.A, &sFileName, GetURL(URL.B) | &sFileName)

Please help; I need a copy of the file in both URLs/folders. (Yes, it is generating the file in URL B.)

Mike

Jim Marion said...

@Mike, if you have PeopleTools 8.53 or later, you can use java.nio.file.Files.copy.
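For reference, this is roughly what that call looks like in plain Java; the wrapper class CopyFile is invented for the example. Unlike renameTo, Files.copy leaves the source file in place. Note that the last parameter of Files.copy is a varargs CopyOption list, so a PeopleCode caller going through GetJavaClass would probably need to pass the options as a Java array; that part is an assumption and is not shown here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only
final class CopyFile {

    /** Copies source to target, overwriting target if it exists; source is left untouched. */
    static void copy(String source, String target) throws IOException {
        Path from = Paths.get(source);
        Path to = Paths.get(target);
        Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

After CopyFile.copy("/source/file.txt", "/target/file.txt"), both files exist with the same content.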

Ankur said...

@Jim :

Hi,

I have a requirement to generate multiple files in a single zip file. We are developing reports using BI Publisher with XML as the data source. I am able to generate multiple files, but each file opens in a different browser tab (in Chrome). We have a custom page with a 'Run report' button, and when we click it, I want all files to be generated in a single zip file.

Can you please help me out on this ?

Thanks in advance.

Jim Marion said...

@Ankur, this post shows an old method for generating zip files. My preferred method as of PT 8.53 is to use Java NIO Zip File system support.

Sumant said...

Hi Jim,
I am sorry to post a late comment on this (by a few years), but I ran into an issue after the zip files are generated. I used the above code and generated the output .zip files in the process output directory, so as to make them available for downloading from the 'View log/trace' link. The .zip file is generated, along with the original text file (which is being zipped).
Problem comes when downloaded zip file is being extracted on local drive, getting error as "An unexpected error is keeping you from copying the file. If you continue to receive this error you can use the error code to search for help with this problem. Error 0x80070057: The Parameter is Incorrect"
After this error, if extraction is cancelled, still the extract folder is created on local drive, and it contains proper folder structure defined, and contains files properly.
I am not really sure whether it is caused by the code or is an issue at the server level. I tried migrating the code to another development instance as well, but faced the same problem. I also checked on other local systems, with the same result.
Not sure what I need to check in this case.
Thanks a lot for all the great posts on PeopleSoft techniques!
Always a follower.

Anonymous said...

Hi Jim...
As a PSoft Developer, I love your posts. They give great insight into some developing techniques we don't often get to see.
Background: Our users have the ability to save multiple attachments for each employee inside the database. Now I need to develop a means to allow them to add all of the attachments (for a given employee) to a zip. I've implemented your code (modified slightly), but I'm running into a problem with code not finding the file(s). Is it possible to get an example of the code necessary to find the file(s) in the database?
Thanks in advance.

Sagaya A said...

Hi Jim, the zip file is created with a dynamic name and generated on the server. How can I make the zip file available for the process under View Log/Trace in PeopleSoft?

Appreciate your help! Thank you!

Jim Marion said...

Take a look at https://blog.jsmpros.com/2008/05/appengine-output-tricks-reporting.html. In there you will find the SQL "SELECT PRCSOUTPUTDIR FROM PSPRCSPARMS WHERE PRCSINSTANCE = %ProcessInstance"

Shubham said...

Hi Jim,

I have a requirement to add multiple files to the same zip file. I am referring to your code. I am able to generate the zip file, but when I open it, it gives the error 'central directory not found'. I think it's an issue with the internal folder path I am giving (folder1).

Suma said...

Hi Jim,
We have a requirement to zip the multiple reports generated by a BIP report that get placed in the Report Manager. We have restricted access to the report repository and hence can't access the files. Is there any other way we can zip the reports before they get published to the Report Manager?

Appreciate your help! Thank you!