compression tricks for storing terabytes of images?

Emmanuel Levy

compression tricks for storing terabytes of images?

Dear All,

I'm wondering whether some of you may have suggestions for archiving
large image datasets.

So far I've come to the conclusion that bzip2 is the most convenient
compression solution: it's lossless, it works on any type of data
(e.g., TIFFs, stacks), and it preserves the metadata.
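
Concretely, what I do amounts to something like the sketch below
(Python; the archive and directory names are just examples):

    import tarfile

    # Bundle a directory of TIFF stacks into one bzip2-compressed
    # tarball; bzip2 is lossless, so images and metadata survive intact.
    with tarfile.open("dataset.tar.bz2", "w:bz2", compresslevel=9) as tar:
        tar.add("dataset/")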

If anyone knows of good, easy-to-implement alternatives, I'd be very
happy to hear about them.

Thanks,

All the best,

Emmanuel
Roger Leigh

Re: compression tricks for storing terabytes of images?

On 21/03/2016 10:37, Emmanuel Levy wrote:
> I'm wondering whether some of you may have suggestions for archiving
> large image datasets.
>
> So far I've come to the conclusion that bzip2 is the most convenient
> compression solution: it's lossless, it works on any type of data
> (e.g., TIFFs, stacks), and it preserves the metadata.
>
> If anyone knows of good, easy-to-implement alternatives, I'd be very
> happy to hear about them.

For archival you might find that "xz" results in smaller file sizes
(https://en.wikipedia.org/wiki/Xz).  It is also a lossless compression
algorithm
(https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Markov_chain_algorithm).
I have used xz compression to archive all my own data (as tar.xz)
without encountering any problems.
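
For example, the equivalent of that workflow from Python (a sketch;
the names are placeholders):

    import tarfile

    # preset=9 is the slowest/smallest setting, equivalent to `xz -9`.
    with tarfile.open("dataset.tar.xz", "w:xz", preset=9) as tar:
        tar.add("dataset/")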

Some discussion and benchmarks you might find interesting:

   http://unix.stackexchange.com/questions/108100/why-are-tar-archive-formats-switching-to-xz-compression-to-replace-bzip2-and-wha
   https://www.rootusers.com/gzip-vs-bzip2-vs-xz-performance-comparison/
   http://tukaani.org/lzma/benchmarks.html

xz has essentially replaced bzip2 for most general uses, the exception
being plain text (and hence genomic data), for which bzip2's
compression algorithm is optimised.  For images, you'll most likely get
better compression and faster decompression with xz; it is somewhat
slower at compression, but if the goal is the smallest possible file
size then that tradeoff is likely worth it.

It would probably be worth testing bzip2 and xz at different compression
levels to see how much benefit you see with your own data.
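
A quick way to run such a test from Python, using its bz2 and lzma
modules (a rough sketch; "sample_stack.tif" stands in for one of your
own files):

    import bz2
    import lzma
    import time

    with open("sample_stack.tif", "rb") as f:
        data = f.read()

    codecs = [
        ("bzip2", lambda level: bz2.compress(data, compresslevel=level)),
        ("xz", lambda level: lzma.compress(data, preset=level)),
    ]

    # Compare size ratio and compression time at a few levels.
    for name, compress in codecs:
        for level in (1, 6, 9):
            start = time.perf_counter()
            compressed = compress(level)
            elapsed = time.perf_counter() - start
            ratio = len(compressed) / len(data)
            print(f"{name} -{level}: ratio {ratio:.3f}, {elapsed:.1f} s")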


Kind regards,
Roger

--
Dr Roger Leigh -- Open Microscopy Environment
Wellcome Trust Centre for Gene Regulation and Expression,
School of Life Sciences, University of Dundee, Dow Street,
Dundee DD1 5EH Scotland UK   Tel: (01382) 386364

The University of Dundee is a registered Scottish Charity, No: SC015096
Douglas Richardson

Re: compression tricks for storing terabytes of images?

Phil Keller's lab has developed an open-source compression format: .klb

It also has a FIJI/ImageJ reader.

You can find it on their lab's webpage here:
https://www.janelia.org/lab/keller-lab/software/efficient-processing-and-analysis-large-scale-light-sheet-microscopy-data

It is also described in their Nature Protocols paper (Amat et al., 2015).

The paper is specific to light-sheet data, but the converter should work
for most data sets.
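
If you want scripted access outside FIJI, the Keller lab also provides
Python bindings (pyklb); here is a minimal sketch assuming its
readfull/writefull interface (the function names are from memory, so
check the package documentation):

    import numpy as np
    import pyklb  # assumed: the Keller lab's Python bindings for KLB

    # Write a 3D stack to .klb (lossless) and read it back.
    stack = np.zeros((50, 512, 512), dtype=np.uint16)  # placeholder data
    pyklb.writefull(stack, "stack.klb")
    restored = pyklb.readfull("stack.klb")
    assert (restored == stack).all()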

-Doug

On Mon, Mar 21, 2016 at 6:37 AM, Emmanuel Levy <[hidden email]>
wrote:

> I'm wondering whether some of you may have suggestions for archiving
> large image datasets. [...]
Kurt Thorn

Re: compression tricks for storing terabytes of images?

On 3/21/2016 3:37 AM, Emmanuel Levy wrote:

> I'm wondering whether some of you may have suggestions for archiving
> large image datasets. [...]

I recently learned about FLIF (http://flif.info/), which is a very
nice compression scheme for 2D image data.  Unfortunately it doesn't
support stacks or metadata, but it seems to perform better than most
other compression schemes on a wide range of images, so it could be
the core of a nice compression solution for scientific images.  That
said, it doesn't seem like any lossless compression scheme is going to
deliver better than about a 2-fold space saving.
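
As a rough sketch of what such a solution might look like: export the
stack as 2D slices and run the reference flif encoder over each one
(the -e flag is from memory, so check flif --help; the metadata would
have to be stored separately):

    import glob
    import subprocess

    # Encode every exported slice with FLIF; one .flif file per slice.
    for png in sorted(glob.glob("stack_slices/*.png")):
        subprocess.run(["flif", "-e", png, png + ".flif"], check=True)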

Kurt

--
Kurt Thorn
Associate Professor
Director, Nikon Imaging Center
http://thornlab.ucsf.edu/
http://nic.ucsf.edu/blog/
Emmanuel Levy

Re: compression tricks for storing terabytes of images?

Dear All,

Thank you for all your suggestions.

I compared pxz to pbzip2 (the multithreaded versions of xz and bzip2),
and as far as I could see, pbzip2 was faster at compression and gave
slightly better ratios.  The disadvantage of bzip2 was the
decompression time (or "test" time, when checking archive integrity),
but I think I can live with that.
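
For reference, the commands I compared were along these lines (the
thread count and file name are specific to my setup; double-check the
flags against the man pages):

    import subprocess

    # -p8 / -T8: use 8 threads; -9: best compression; -k: keep the input.
    subprocess.run(["pbzip2", "-p8", "-9", "-k", "dataset.tar"], check=True)
    subprocess.run(["pxz", "-T8", "-9", "-k", "dataset.tar"], check=True)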

Best wishes,

Emmanuel



On 21 March 2016 at 13:40, Roger Leigh <[hidden email]> wrote:

> For archival you might find that "xz" results in smaller file sizes
> (https://en.wikipedia.org/wiki/Xz). [...]
>
> It would probably be worth testing bzip2 and xz at different
> compression levels to see how much benefit you see with your own data.