Monday, October 15, 2012

Manipulating Zip Files with PeopleCode

I've seen a few forum posts that show how to zip files using both Exec and the XML Publisher PSXP_RPTDEFNMANAGER:Utility app package. Those are great options, but might not fit every scenario. Since the Java API includes support for zip files, let's investigate how we can use it to create or extract zip files.

Java allows developers to create zip files by writing data to a ZipOutputStream. We've used OutputStreams a few times on this blog to write data to files. A ZipOutputStream is just a wrapper around an OutputStream that writes contents in the zip file format. Here is an example of reading a text file and writing it out to a ZipOutputStream:

REM ** The file I want to compress;
Local string &fileNameToZip = "c:\temp\blah.txt";

REM ** The internal zip file's structure -- internal location of blah.txt;
Local string &zipInternalPath = "my/internal/zip/folder/structure";

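REM ** Open the target in overwrite mode (False) because ZipOutputStream cannot add entries to an existing zip;
Local JavaObject &zip = CreateJavaObject("java.util.zip.ZipOutputStream", CreateJavaObject("java.io.FileOutputStream", "c:\temp\compressed.zip", False));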

Local JavaObject &file = CreateJavaObject("java.io.File", &fileNameToZip);
REM ** We will read &fileNameToZip into a buffer and write it out to &zip;
Local JavaObject &buf = CreateJavaArray("byte[]", 1024);

Local number &byteCount;
Local JavaObject &in = CreateJavaObject("java.io.FileInputStream", &fileNameToZip);

Local JavaObject &zipEntry = CreateJavaObject("java.util.zip.ZipEntry", &zipInternalPath | "/" | &file.getName());

REM ** Make sure zip entry retains original modified date;
&zipEntry.setTime(&file.lastModified());

&zip.putNextEntry(&zipEntry);

&byteCount = &in.read(&buf);

While &byteCount > 0
   &zip.write(&buf, 0, &byteCount);
   &byteCount = &in.read(&buf);
End-While;

&in.close();
&zip.flush();
&zip.close();

To add multiple files to a single zip file, we can convert the above code into a function (preferably a FUNCLIB function) and then call it multiple times, once for each file:

Function AddFileToZip(&zipInternalPath As string, &fileNameToZip As string, &zip As JavaObject)
   Local JavaObject &file = CreateJavaObject("java.io.File", &fileNameToZip);
   REM ** We will read &fileNameToZip into a buffer and write it out to &zip;
   Local JavaObject &buf = CreateJavaArray("byte[]", 1024);
   
   Local number &byteCount;
   Local JavaObject &in = CreateJavaObject("java.io.FileInputStream", &fileNameToZip);
   
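   REM ** Only prefix the folder when one was given so a blank path does not produce a leading slash;
   Local string &entryName = &file.getName();
   If All(&zipInternalPath) Then
      &entryName = &zipInternalPath | "/" | &entryName;
   End-If;
   Local JavaObject &zipEntry = CreateJavaObject("java.util.zip.ZipEntry", &entryName);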
   
   REM ** Make sure zip entry retains original modified date;
   &zipEntry.setTime(&file.lastModified());
   
   &zip.putNextEntry(&zipEntry);
   
   &byteCount = &in.read(&buf);
   
   While &byteCount > 0
      &zip.write(&buf, 0, &byteCount);
      &byteCount = &in.read(&buf);
   End-While;
   
   &in.close();
End-Function;


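REM ** Open the target in overwrite mode (False) because ZipOutputStream cannot add entries to an existing zip;
Local JavaObject &zip = CreateJavaObject("java.util.zip.ZipOutputStream", CreateJavaObject("java.io.FileOutputStream", "c:\temp\compressed.zip", False));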

AddFileToZip("folder1", "c:\temp\file1.txt", &zip);
AddFileToZip("folder1", "c:\temp\file2.txt", &zip);
AddFileToZip("folder2", "c:\temp\file1.txt", &zip);
AddFileToZip("folder2", "c:\temp\file2.txt", &zip);

&zip.flush();
&zip.close();
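
If the files to add all live under one folder, the repeated AddFileToZip calls can be driven by a directory walk instead of being listed by hand. Here is a minimal sketch of that idea in plain Java rather than PeopleCode, assuming Java 7 or later. The names ZipFolderSketch and zipFolder are made up for this illustration, and empty folders are skipped because only files get entries.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of PeopleTools or the Java API.
final class ZipFolderSketch {

    /** Zips every file under sourceDir into zipFile, keeping relative paths as entry names. */
    static void zipFolder(File sourceDir, File zipFile) throws IOException {
        // No append flag: the archive is always written from scratch.
        try (ZipOutputStream zip = new ZipOutputStream(new FileOutputStream(zipFile))) {
            addFolder(sourceDir, "", zip);
        }
    }

    /** Recursive step: prefix is the path inside the zip, either "" or ending in "/". */
    private static void addFolder(File dir, String prefix, ZipOutputStream zip) throws IOException {
        File[] children = dir.listFiles();
        if (children == null) {
            throw new IOException("Cannot list " + dir);
        }
        Arrays.sort(children); // deterministic entry order
        for (File child : children) {
            if (child.isDirectory()) {
                addFolder(child, prefix + child.getName() + "/", zip);
            } else {
                addFile(child, prefix + child.getName(), zip);
            }
        }
    }

    /** Same loop as AddFileToZip: read into a buffer, write to the current entry. */
    private static void addFile(File file, String entryName, ZipOutputStream zip) throws IOException {
        ZipEntry entry = new ZipEntry(entryName);
        // Keep the original modified date on the entry.
        entry.setTime(file.lastModified());
        zip.putNextEntry(entry);
        byte[] buf = new byte[1024];
        try (InputStream in = new FileInputStream(file)) {
            int byteCount = in.read(buf);
            while (byteCount > 0) {
                zip.write(buf, 0, byteCount);
                byteCount = in.read(buf);
            }
        }
        zip.closeEntry();
    }
}
```

Calling ZipFolderSketch.zipFolder with a source folder and a target zip file produces entries such as folder1/file1.txt, always with forward slashes. Keep the zip file outside the folder being zipped so the archive does not try to include itself.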

The content to zip doesn't have to come from a static file in your file system. It could come from the database or... well, anywhere. Here is an example of zipping static text. In this example I intentionally left the internal zip file path (folder) blank to show how to create a zip file with no folder structure.

Local JavaObject &textToCompress = CreateJavaObject("java.lang.String", "This is some text to compress... probably a bloated XML document or something ;)");
Local string &zipInternalFileName = "contents.txt";

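REM ** Open the target in overwrite mode (False) because ZipOutputStream cannot add entries to an existing zip;
Local JavaObject &zip = CreateJavaObject("java.util.zip.ZipOutputStream", CreateJavaObject("java.io.FileOutputStream", "c:\temp\compressed.zip", False));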
Local JavaObject &zipEntry = CreateJavaObject("java.util.zip.ZipEntry", &zipInternalFileName);
Local JavaObject &buf = &textToCompress.getBytes();
Local number &byteCount = &buf.length;

&zip.putNextEntry(&zipEntry);

&zip.write(&buf, 0, &byteCount);

&zip.flush();
&zip.close();
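
The destination doesn't have to be a file either. Because ZipOutputStream wraps any OutputStream, the same calls can target a ByteArrayOutputStream and never touch the disk. The following is a rough sketch in plain Java, assuming Java 7 or later. InMemoryZipSketch and zipText are illustrative names, and UTF-8 is my choice of encoding rather than anything the listing above requires.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of PeopleTools or the Java API.
final class InMemoryZipSketch {

    /** Compresses text into a one-entry zip archive held entirely in memory. */
    static byte[] zipText(String entryName, String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(entryName));
            // Name the charset explicitly; a bare getBytes() uses the platform default.
            zip.write(text.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        // Closing the ZipOutputStream wrote the central directory, so the archive is complete here.
        return bytes.toByteArray();
    }
}
```

The returned byte array is a complete zip archive, so it can be handed to whatever consumes bytes next, such as a database column or a network stream.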

And, finally, unzipping files. The following example prints the text inside each file from a zip file named "compressed.zip" that contains four fictitious text files named file1.txt, file2.txt, file3.txt, and file4.txt.

Local JavaObject &zipFileInputStream = CreateJavaObject("java.io.FileInputStream", "c:\temp\compressed.zip");
Local JavaObject &zipInputStream = CreateJavaObject("java.util.zip.ZipInputStream", &zipFileInputStream);
Local JavaObject &zipEntry = &zipInputStream.getNextEntry();
Local JavaObject &buf = CreateJavaArray("byte[]", 1024);
Local number &byteCount;

While &zipEntry <> Null
   
   REM ** Directory entries have no content, so only file entries are read;
   If Not &zipEntry.isDirectory() Then
      Local JavaObject &out = CreateJavaObject("java.io.ByteArrayOutputStream");
      &byteCount = &zipInputStream.read(&buf);
      
      While &byteCount > 0
         &out.write(&buf, 0, &byteCount);
         &byteCount = &zipInputStream.read(&buf);
      End-While;
      
      &zipInputStream.closeEntry();
      MessageBox(0, "", 0, 0, &out.toString());
   End-If;
   
   &zipEntry = &zipInputStream.getNextEntry();
End-While;

&zipInputStream.close();
&zipFileInputStream.close();

What about unzipping binary files into the file system? I'll let you write that one.
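
If you want a starting point, here is one possible shape for it, sketched in plain Java (Java 7 or later) rather than PeopleCode. It is the same read loop as above, except each entry is written to a FileOutputStream instead of a ByteArrayOutputStream. The names UnzipSketch and unzip are invented for this example, and the check that rejects entries resolving outside the target folder is my own addition.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of PeopleTools or the Java API.
final class UnzipSketch {

    /** Extracts every entry of zipFile beneath targetDir; returns the number of files written. */
    static int unzip(File zipFile, File targetDir) throws IOException {
        String root = targetDir.getCanonicalPath() + File.separator;
        byte[] buf = new byte[1024];
        int filesWritten = 0;
        try (ZipInputStream zip = new ZipInputStream(new FileInputStream(zipFile))) {
            ZipEntry entry = zip.getNextEntry();
            while (entry != null) {
                File out = new File(targetDir, entry.getName());
                // Refuse entry names such as "../x" that would land outside targetDir.
                if (!out.getCanonicalPath().startsWith(root)) {
                    throw new IOException("Entry escapes target folder: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    mkdirs(out);
                } else {
                    mkdirs(out.getParentFile());
                    // Bytes are copied as-is, so binary content survives unchanged.
                    try (OutputStream os = new FileOutputStream(out)) {
                        int byteCount = zip.read(buf);
                        while (byteCount > 0) {
                            os.write(buf, 0, byteCount);
                            byteCount = zip.read(buf);
                        }
                    }
                    filesWritten++;
                }
                zip.closeEntry();
                entry = zip.getNextEntry();
            }
        }
        return filesWritten;
    }

    /** Creates dir and any missing parents, failing loudly if that is not possible. */
    private static void mkdirs(File dir) throws IOException {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Cannot create folder " + dir);
        }
    }
}
```

Existing files with the same name are overwritten, which may or may not be what you want in a shared folder.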

Password-protected zip files? Java doesn't make this easy. There are a few Java libraries, but as Chris Rigsby points out here, using non-standard Java classes (including your own) can be hazardous. At this time, it seems the best way to password protect a zip file is to use Exec to call a command line zip program. On Linux with the zip utility, use the -P parameter to encrypt with a password. Keep in mind that a password passed with -P is visible on the command line, for example in a process listing.

32 comments:

  1. Is there a significant benefit to doing it this way instead of calling the system unzip?

  2. @Brett, good question. For some, there may be no difference.

    The main benefit is that it doesn't require any system utilities. For those that run on Linux, finding a compression utility or writing a shell script is trivial. For Windows, however, there isn't a lot in the GPL/GNU space (except UnxUtils, of course). The example shown here uses the Java API which is guaranteed to be in your PeopleSoft system.

    The second benefit is shown in the third code listing. It shows how to stream information directly into a zip file without having to write to a file first. When it comes to distributed processing, servers, load balancing, etc., it is difficult to make assumptions about the file system. Now, what I didn't show was how to stream a zip out to the response object without ever writing to disk. That sounds like a good post for tonight :)

  3. Hi Jim

    A few months back we had a similar requirement for one of our clients, and we ended up developing two different scripts for Unix and Windows.

    Maintaining and developing the batch scripts was just a nightmare.

    This looks like a much cleaner solution.

  4. @Neeraj, Thank you for the feedback. What you mention is a common scenario and why I went the Java/PeopleCode route instead of a batch file.

    If you have to password protect the zip files, then the solution is not as clean, but for standard compression, this is a good solution.

    Notice that I didn't have to use any reflection with the Java! Just nice, clean method calls.

  5. Hi, your post inspired me to put some code I wrote a while back online, specifically for the zip password issue.

    I have a free library that provides zipping with or without passwords as well as checking if office documents are encrypted.

    The zip password is 'standard' encryption - so compatible with all zipping programs!

    The library is Java and there is a wrapper so that it works in PeopleSoft with relative ease.

    Example: &zipUtil.CreateZipFolderEncrypted(&FolderPathToZip, &ZipFileToSaveAs, &Password);


    http://users.adam.com.au/kane81/PeopleSoft/Utilities/

  6. @Kane81, thanks for sharing!

  7. Hi Jim,

    A very useful tutorial.

    Thanks for this post. I have expanded this tutorial to create TAR files in PeopleCode, using the JTAR library.

  8. @Raajesh, thanks for sharing!

  9. Hello Jim,

    I am using the code to zip multiple PDFs into a zip file. But when I unzip the file, the PDFs are under sub-directories.

    E.g., if the PDF files are under
    \\1.1.1\folder1\folder2\OutPut1.pdf
    \\1.1.1\folder1\folder2\OutPut2.pdf

    then the resulting zip file has the following structure

    1.1.1\folder1\folder2\OutPut1.pdf
    1.1.1\folder1\folder2\OutPut2.pdf

    Is there a way to not include the sub-directories but just the PDF files inside the zip file?

  10. @Narender, yes, just set &zipInternalPath to "".

  11. Is there any way to check a zip file's integrity before opening it or starting to process it? Let me try to explain. I receive a lot of zip files by SFTP and process them at night. This week I had a problem with a big zip file: my process started opening the file while it was still arriving, and I did not receive any security error from the OS. In other words, I can open the zip file while it is still being transferred. I have tried several things, but so far I don't know how to check whether the transfer has finished so that I can safely open the zip file.

  12. @ChiDONEt, interesting use case. I don't have an answer for you. If this is PeopleSoft related, you can try posting your question on the PeopleSoft OTN General Discussion forum.

  13. Thanks Jim... I will post on PeopleSoft Forum Community...

    I know how to resolve it, but I don't like my solution: check the size of the zip file and store it in a table. On the second run, if the size is the same, I can begin processing the zip file.

  14. @ChiDONEt, I don't know if it is feasible, but an MD5 hash sum is a good way to verify files as well. If you can store that instead, or in addition to, that might be helpful. They do require additional time to generate, though.

  15. Is it possible to zip multiple folders under a specified path into one zip file using PeopleCode?

  16. @Sandeep, absolutely! It is easier if you are on PT 8.53 and can use the new Java ZipFileSystem.

  17. @Jim - I am on PT 8.53. I have a requirement: there is a parent folder that contains many child folders (a child folder may or may not contain files such as .doc or .pdf). I have to create a zip that contains these child folders and the files within them. Can you please guide me or provide a sample code snippet for this? I appreciate your help in this regard.

    Thanks,
    Sandeep

  18. @Sandeep, good idea. I'll add it to the list.

  19. Jim,

    I was surprised that I couldn't find a thread on this topic already - so please excuse my placing this comment here.

    I'm creating a CSV file from data loaded into a temp table in an App Engine. This works flawlessly until you reach the requirement that any field in the CSV which is NULL should output a space:
    abc, ,1234, ,8/9/10, ,zyx

    I've ensured that the spaces are preserved in the record, but cannot seem to keep them when writing to the file. Any ideas?

    @Becca

  20. @Becca, did you try enforcing quotes around text?

  21. @Becca, the only other option I know of is to skip the FileLayout and write directly to a file by iterating over a set of rows and applying the appropriate delimiters yourself.

  22. Jim,

    Since I'm not supposed to have any text qualifiers, it looks like I'm going to have to go with option #2. I kinda figured that was the case, but didn't want to write it off.

    Thanks again!

    @Becca

  23. Anonymous, 9:34 PM

    Hey,
    I used the method below in AE PeopleCode and it works fine, but my requirement is that I need to keep both the source and the target file:

    Local JavaObject &source = CreateJavaObject("java.io.File", "/source/file.txt");
    Local JavaObject &target = CreateJavaObject("java.io.File", "/target/file.txt");
    &source.renameTo(&target);

    This method works fine, but the issue is that I need a copy in both the source and the destination.

    With this method, after the file is moved, no copy remains in the first folder.
    Is there a file copy option in Java that can be used in AE PeopleCode?
    I also tried the PutAttachment function, but it throws an error like:
    An invalid parameter has been passed to the file attachment function. (2,788)

    My code: PutAttachment(URL.A, &sFileName, GetURL(URL.B) | &sFileName)

    Please help me. I need a copy of the file in both URLs/folders. (Yes, it is generating the file in URL B.)

    Mike

  24. @Mike, if you have PeopleTools 8.53 or later, you can use java.nio.file.Files.copy

  25. @Jim :

    Hi,

    I have a requirement to generate multiple files in a single zip file. We are developing reports using BI Publisher with XML as the data source. I am able to generate multiple files, but each file opens in a different browser tab (in Chrome). We have a custom page with a 'Run report' button, and when it is clicked I want all of the files to be generated in a single zip file.

    Can you please help me out with this?

    Thanks in advance.

  26. @Ankur, this post shows an old method for generating zip files. My preferred method as of PT 8.53 is to use Java NIO Zip File system support.

  27. Hi Jim,
    I am sorry to post a late comment on this (by a few years), but I ran into an issue after the zip files are generated. I used the above code and generated the output .zip files in the process output directory so that they are available for download from the 'View Log/Trace' link. The .zip file is generated along with the original text file (the one being zipped).
    The problem comes when the downloaded zip file is extracted on a local drive. I get the error "An unexpected error is keeping you from copying the file. If you continue to receive this error you can use the error code to search for help with this problem. Error 0x80070057: The Parameter is Incorrect"
    After this error, if the extraction is cancelled, the extract folder is still created on the local drive, and it contains the proper folder structure and the files.
    I am not sure whether this is caused by the code or is an issue at the server level. I tried migrating the code to another development instance, but faced the same challenge. I also checked on other local systems, with the same result.
    I am not sure what I need to check in this case.
    Thanks a lot for all the great posts on PeopleSoft techniques!
    Always a follower.

  28. Anonymous, 12:43 PM

    Hi Jim...
    As a PSoft Developer, I love your posts. They give great insight into some developing techniques we don't often get to see.
    Background: Our users have the ability to save multiple attachments for each employee inside the database. Now I need to develop a means to allow them to add all of the attachments (for a given employee) to a zip. I've implemented your code (modified slightly), but I'm running into a problem with code not finding the file(s). Is it possible to get an example of the code necessary to find the file(s) in the database?
    Thanks in advance.

  29. Hi Jim, the zip file is created with a dynamic name and generated on the server. How can I make the zip file available for the process under View Log/Trace in PeopleSoft?

    Appreciate your help! Thank you!

  30. Take a look at https://blog.jsmpros.com/2008/05/appengine-output-tricks-reporting.html. In there you will find the SQL "SELECT PRCSOUTPUTDIR FROM PSPRCSPARMS WHERE PRCSINSTANCE = %ProcessInstance"

  31. Hi Jim,

    I have a requirement to add multiple files to the same zip file, and I am referring to your code. I am able to generate the zip file, but when I open it I get the error 'central directory not found'. I think the issue is with the internal folder path I am giving (folder1).

  32. Hi Jim,
    We have a requirement to zip the multiple reports generated by a BIP report that get placed in Report Manager. We have restricted access to the report repository and hence can't access the files. Is there any other way we can zip the reports before they are published to Report Manager?

    Appreciate your help! Thank you!
