Re: Data Storage

Posted by Zdenek Svindrych-2 on
URL: http://confocal-microscopy-list.275.s1.nabble.com/Data-Storage-tp7585019p7585042.html

*****
To join, leave or search the confocal microscopy listserv, go to:
http://lists.umn.edu/cgi-bin/wa?A0=confocalmicroscopy
Post images on http://www.imgur.com and include the link in your posting.
*****

Hi Guy,
ideally, it should work. But in reality each pixel contains noise (remember
the cameras usually output 16-bit values), so if you store only the (sparse)
blinking events and ignore the noise, the compression is no longer lossless.
You cannot use off-the-shelf lossy compression algorithms, as they will
bias some important parameters in your image, e.g. the noise around the
'light blob' may be used by your localization algorithm.
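
To illustrate the point about noise, here is a rough sketch (not a benchmark) that compresses two synthetic 16-bit frames with zlib: one with a perfectly flat background and one with Gaussian noise added. The frame size, the offset of 100 counts, the noise level and the single-pixel "blinks" are all assumptions made up for the illustration; real camera data will behave differently in detail.

```python
import random
import struct
import zlib

def make_frame(width, height, n_events, noise_sigma, rng):
    """Build a synthetic 16-bit frame: flat offset, optional Gaussian noise, sparse bright pixels."""
    offset = 100  # assumed camera baseline in counts (illustrative only)
    pixels = []
    for _ in range(width * height):
        value = offset + (rng.gauss(0, noise_sigma) if noise_sigma > 0 else 0)
        pixels.append(max(0, min(65535, int(round(value)))))
    for _ in range(n_events):
        # One bright pixel per event, a crude stand-in for a real blinking spot.
        pixels[rng.randrange(width * height)] = 5000
    # Pack as little-endian unsigned 16-bit integers.
    return struct.pack("<%dH" % len(pixels), *pixels)

def compression_ratio(raw):
    """Raw size divided by zlib-compressed size (lossless, maximum compression level)."""
    return len(raw) / len(zlib.compress(raw, 9))

rng = random.Random(0)  # fixed seed so the run is repeatable
clean = make_frame(256, 256, 20, 0.0, rng)
noisy = make_frame(256, 256, 20, 10.0, rng)
print("noise-free ratio: %.1f" % compression_ratio(clean))
print("noisy ratio:      %.1f" % compression_ratio(noisy))
```

The noise-free frame compresses enormously, while the noisy one barely shrinks, because a lossless coder has to preserve the noise in every pixel.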

You can localize all the sparse events and store all possible parameters,
including background intensities and variations, and call this 'image
compression'. And indeed this should be fine for archiving purposes, if you
show that a reconstructed ('decompressed') image is 'statistically
indistinguishable' from the original. But, as pointed out earlier, you most
likely cannot re-process the "decompressed" images with new localization
algorithms...
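
As a back-of-the-envelope sketch of how much such a 'compression' could save, the snippet below compares a localization table with the raw stack. The record layout (a frame index plus seven float32 parameters) and the figure of 20 events per frame are purely hypothetical, chosen only to make the arithmetic concrete.

```python
import struct

# Hypothetical per-event record: frame index (uint32) followed by seven float32
# values (x, y, width, amplitude, background level, background noise, uncertainty).
RECORD = struct.Struct("<I7f")

def localization_table_bytes(frames, events_per_frame):
    """Size of a table holding one fixed-size record per localized event."""
    return frames * events_per_frame * RECORD.size

def raw_stack_bytes(frames, width, height, bytes_per_pixel=2):
    """Size of the uncompressed image stack (16-bit pixels by default)."""
    return frames * width * height * bytes_per_pixel

frames = 20000
table = localization_table_bytes(frames, events_per_frame=20)  # assumed event density
raw = raw_stack_bytes(frames, 256, 256)
print("table: %.1f MB" % (table / 1e6))
print("raw:   %.2f GB" % (raw / 1e9))
print("ratio: %.0f x" % (raw / table))
```

Under these assumptions the table is a couple of hundred times smaller than the raw stack, but everything that the chosen parameters do not capture is gone.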

But the discussion about archiving "raw" raw data is academic, since even
the cameras themselves process the image internally (hot pixels, etc.)...

zdenek


---------- Original message ----------
From: Guy Cox <[hidden email]>
To: [hidden email]
Date: 13. 4. 2016 9:34:31
Subject: Re: Data Storage

"OK, I don't do stochastic super-resolution but I am familiar with it. It
seems to me (having worked on data compression in the distant past) that
this data, being sparse, should be VERY highly compressible without loss.
Has anyone looked at this? I'd reckon this would make the problem disappear.


Guy

-----Original Message-----
From: Confocal Microscopy List [mailto:[hidden email]] On
Behalf Of Jorand, Raphael
Sent: Tuesday, 12 April 2016 2:32 AM
To: [hidden email]
Subject: Re: Data Storage

Hi,

I agree with Kurt. Our 2-color STORM datasets of 20,000 frames at 256 x 256
pixels are usually around 6 GB. So over the year it is a lot but not
impossible to manage. Probably less than 10 TB per year.
We thought a lot about not keeping the raw data, but from experience we found
that we sometimes needed to re-analyze the data to extract new kinds of
information.

Raphael

-----Original Message-----
From: Confocal Microscopy List [mailto:[hidden email]] On
Behalf Of Kurt Thorn
Sent: Monday, April 11, 2016 9:12 AM
To: [hidden email]
Subject: Re: Data Storage

Hi Claire -

There's one error in your calculation:

On 4/11/2016 8:27 AM, Claire Brown, Dr. wrote:
> I'm working on some numbers for cyberinfrastructure for Compute Canada. I
> am not currently doing single point localization microscopy but we plan to
> get into it.
>
> I just wonder is there a consensus in the field if the raw images for each
> point localization have to be retained for 7 years or are people just
> keeping the localization data? When I do these calculations I get the
> following so it seems impossible with existing infrastructure to retain the
> raw data.
>
> 2000 x 2000 pixel camera and 16-bit images = 2000x2000x16 = 64 MB per
> image

16 bits = 2 bytes, so this is 8 MB per image. Also, we typically do
super-resolution imaging on much smaller ROIs, often only 256 x 256 pixels
(on a Nikon N-STORM). Are you sure your users are going to want / need such
large fields of view? 256 x 256 x 2 bytes is 131 KB.
>
> 10,000 frames for single molecule imaging and two colour super
> resolution
> 10,000x2x64 = 1.28 TB

10000 x 2 x 8 MB = 160 GB; 10000 x 2 x 131 KB = 2.6 GB
>
> 4 conditions, 10 images per condition, experiment done in triplicate
> 1.28 TBx4x10x3 = 154 TB per experiment

160 GB x 4 x 10 x 3 = 19 TB; 2.6 GB x 4 x 10 x 3 = 312 GB

So if you keep the size of the areas of interest down, the data sizes are
not too bad.
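
For anyone who wants to redo the arithmetic above with their own numbers, a minimal sketch follows. It assumes 2 bytes per pixel and decimal units (1 GB = 10^9 bytes); the function names are just illustrative.

```python
def dataset_bytes(width, height, bytes_per_pixel, frames, colours):
    """Uncompressed size in bytes of one multi-colour image stack."""
    return width * height * bytes_per_pixel * frames * colours

def human(n_bytes):
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n_bytes < 1000 or unit == "TB":
            return "%.1f %s" % (n_bytes, unit)
        n_bytes /= 1000.0

# 4 conditions x 10 images per condition x 3 replicates
acquisitions = 4 * 10 * 3

for side in (2000, 256):
    per_frame = dataset_bytes(side, side, 2, 1, 1)
    per_dataset = dataset_bytes(side, side, 2, 10000, 2)
    per_experiment = per_dataset * acquisitions
    print("%4d x %-4d frame: %s, dataset: %s, experiment: %s" % (
        side, side, human(per_frame), human(per_dataset), human(per_experiment)))
```

Carried through without intermediate rounding, the 256 x 256 case comes out slightly above the 312 GB quoted above, since that figure starts from the rounded 2.6 GB per dataset.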

I also think a case could be made for not keeping the raw data; Illumina
sequencers are basically a microscope in a box and they do not keep the raw
image data anymore. I believe they keep some reduced representation of it,
but I am not sure of the details.

Kurt
>
> So I'm guessing people are not keeping the raw data or I made a mistake in
> my calculations.
>
> Looking forward to some feedback!
>
> Sincerely,
>
> Claire
>


--
Kurt Thorn
Associate Professor
Director, Nikon Imaging Center
http://thornlab.ucsf.edu/
http://nic.ucsf.edu/blog/


---------------------------------------------------------------------"