Path: inforamp.net!coach07.inforamp.net!user
From: poynton@poynton.com (Charles Poynton)
Newsgroups: sci.engr.advanced-tv
Subject: PhotoCD compression
Date: Wed, 01 Feb 1995 23:06:35 -0500
Organization: Poynton Vector
Lines: 66
Message-ID:
NNTP-Posting-Host: coach07.inforamp.net

Cleaning out the mailbox, Craig wrote in a followup about PhotoCD
compression,

>Thanks for the details on the PhotoCD format. I am fairly certain that the
>lowest sub-band that is stored is for the 768 x 512 level and that the others
>are created on the fly...but I could also be wrong about this.

How can I say this politely ... ah, let's see ... how about ... you're
wrong.

The following is all interpreted from information published by Kodak --
no secrets here.

First, understand that the encoder has a copy of the decoder. Sort of
like MPEG. The encoder knows what the decoder will do at every point.

The encoder downsamples the 3072x2048 input image data with really good
2:1 spatial downsampling filters, first to 1536x1024, then in a second
pass to 768x512. The 768x512 ("1base") image is stored directly,
uncompressed. Then third and fourth passes compute 384x256 ("base/4")
and 192x128 ("base/16") images. These are also stored directly,
uncompressed. Each pass ultimately takes a 2x2 block of pixels down to
1 pixel, but the filters are larger than this; the regions overlap.

To compute the compressed 4base (1536x1024) image data to be stored, the
encoder upsamples the 1base uncompressed image using a cheap upsampler,
then compares that to the reference 1536x1024 image. It computes the
differences ("deltas"), Huffman-codes the differences, and stores those.
If the deltas occupy too much space, it quantizes them, but this is very
rare.

To compute the compressed 16base (3072x2048) image data to be stored,
the encoder upsamples the decompressed 4base image using a cheap
upsampler, then compares that to the original 3072x2048 image data.
It computes the differences, Huffman-codes them, and stores those as
deltas. If the deltas occupy too much space, they are quantized (and
thereby made less voluminous). This is rare, but not quite so rare as in
the 4base case; that is, if detail needs to be discarded, it is
discarded first at the highest spatial frequencies.

That mechanism has not been published. The file format has not been
published either.

You may wonder why the encoder uses cheap upsamplers. Well, those are
exactly the upsampling operations done in a decoder -- possibly at
consumer electronics prices -- and those the encoder must simulate
exactly.

A PhotoCD decompressor takes the 1base 768x512 uncompressed image as a
starting point. It interpolates up to 4base 1536x1024 using its cheap
filter, then Huffman-decodes the deltas and applies them (by addition).
Then it upsamples THIS using the same cheap filter, Huffman-decodes the
next level of deltas and applies them (by addition), and that's the
decompressed 3072x2048 image.

All of the computation and storage is done in PhotoYCC color space,
documented in ... let's see, I just pulled a FAQ about this off the net
somewhere ... where is that damn thing ... well, I can't find it right
now but I'm sure it's here somewhere.

C.

--
Charles Poynton
vox: +1 416 486 3271
fax: +1 416 486 3657
poynton@poynton.com [preferred, Mac Eudora, MIME, BinHqx]
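
The encode and decode passes described in the post can be sketched in a
few lines of Python. This is only an illustration under loud
assumptions: the function names (downsample, upsample, quantize, encode,
decode) are hypothetical, the 2x2 box average and pixel replication are
stand-ins for Kodak's actual filters, the uniform quantizer is a guess
since that mechanism has not been published, and Huffman coding is left
out entirely. It works on a single plane of integers whose width and
height are multiples of 4.

```python
# Sketch of a PhotoCD-style residual pyramid (one image plane).
# ASSUMPTIONS: box-average downsampling and pixel-replication upsampling
# are placeholders for the real filters; the quantizer is hypothetical;
# entropy (Huffman) coding of the deltas is omitted.

def downsample(img):
    """2:1 downsample in each direction by averaging 2x2 blocks."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1]
              + img[y + 1][x] + img[y + 1][x + 1] + 2) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def upsample(img):
    """Cheap 1:2 upsample in each direction by pixel replication."""
    out = []
    for row in img:
        wide = [p for p in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out

def subtract(a, b):
    return [[pa - pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

def quantize(deltas, step):
    """Round each delta to the nearest multiple of step (step=1: no-op)."""
    half = step // 2
    return [[step * ((d + half) // step) for d in row] for row in deltas]

def encode(img16, step=1):
    """Return (base, deltas4, deltas16) for a full-resolution plane."""
    ref4 = downsample(img16)       # reference 4base image
    base = downsample(ref4)        # 1base, stored directly
    deltas4 = quantize(subtract(ref4, upsample(base)), step)
    # Predict 16base from the DECODED 4base, exactly as the decoder will.
    decoded4 = add(upsample(base), deltas4)
    deltas16 = quantize(subtract(img16, upsample(decoded4)), step)
    return base, deltas4, deltas16

def decode(base, deltas4, deltas16):
    """Rebuild the 4base and 16base planes from base plus deltas."""
    img4 = add(upsample(base), deltas4)
    img16 = add(upsample(img4), deltas16)
    return img4, img16
```

With step=1 the round trip is exact. With a larger step, the error at
full resolution stays within step//2 per pixel in this sketch, because
the 16base deltas are computed against the decoded 4base image rather
than the reference one, so 4base quantization error does not carry
upward. That is the point of the encoder holding a copy of the decoder.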