Data Storage

Claire Brown
Data Storage

*****
To join, leave or search the confocal microscopy listserv, go to:
http://lists.umn.edu/cgi-bin/wa?A0=confocalmicroscopy
Post images on http://www.imgur.com and include the link in your posting.
*****

I'm working on some numbers for cyberinfrastructure for Compute Canada. I am not currently doing single point localization microscopy, but we plan to get into it.

I just wonder: is there a consensus in the field on whether the raw images for each point localization have to be retained for 7 years, or are people just keeping the localization data? When I do these calculations I get the following, so it seems impossible with existing infrastructure to retain the raw data.

2000 x 2000 pixel camera and 16-bit images = 2000x2000x16 = 64 MB per image

10,000 frames for single molecule imaging and two colour super resolution
10,000x2x64 = 1.28 TB

4 conditions, 10 images per condition, experiment done in triplicate
1.28 TBx4x10x3 = 154 TB per experiment

So I'm guessing people are not keeping the raw data or I made a mistake in my calculations.

Looking forward to some feedback!

Sincerely,

Claire
Kurt Thorn

Re: Data Storage


Hi Claire -

There's one error in your calculation:

On 4/11/2016 8:27 AM, Claire Brown, Dr. wrote:

> I'm working on some numbers for cyberinfrastructure for Compute Canada. I am not currently doing single point localization microscopy but we plan to get into it.
>
> I just wonder is there a consensus in the field if the raw images for each point localization have to be retained for 7 years or are people just keeping the localization data? When I do these calculations I get the following so it seems impossible with existing infrastructure to retain the raw data.
>
> 2000 x 2000 pixel camera and 16-bit images = 2000x2000x16 = 64 MB per image

16 bits = 2 bytes, so this is 8 MB per image.  Also, we typically do
superresolution imaging on much smaller ROIs, often only 256 x 256
pixels (on a Nikon N-STORM). Are you sure your users are going to want or
need such large fields of view?  256 x 256 x 2 bytes is 131 KB per frame.
>
> 10,000 frames for single molecule imaging and two colour super resolution
> 10,000x2x64 = 1.28 TB

10000 x 2 x 8 = 160 GB; 10000 x 2 x 131 KB = 2.6 GB
>
> 4 conditions, 10 images per condition, experiment done in triplicate
> 1.28 TBx4x10x3 = 154 TB per experiment

160 GB x 4 x 10 x 3 = 19 TB; 2.6 GB x 4 x 10 x 3 = 312 GB

So if you keep the size of the areas of interest down, the data sizes
are not too bad.
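Kurt's corrected arithmetic can be reproduced with a few lines; a minimal sketch in Python (decimal MB/GB/TB assumed, acquisition counts taken from the thread):

```python
# Raw-data estimate for localization microscopy, using 2 bytes per 16-bit pixel.
def raw_data_bytes(width, height, frames, colours, conditions, images, replicates):
    """Total raw-data size in bytes for one experiment."""
    bytes_per_frame = width * height * 2            # 16-bit pixels = 2 bytes each
    per_acquisition = bytes_per_frame * frames * colours
    return per_acquisition * conditions * images * replicates

full_chip = raw_data_bytes(2000, 2000, 10_000, 2, 4, 10, 3)
small_roi = raw_data_bytes(256, 256, 10_000, 2, 4, 10, 3)
print(f"full 2000x2000 chip: {full_chip / 1e12:.1f} TB")   # 19.2 TB per experiment
print(f"256x256 ROI: {small_roi / 1e9:.0f} GB")            # ~315 GB per experiment
```

The small difference from the 312 GB figure above comes from rounding 131,072 bytes down to 131 KB in the by-hand estimate.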

I also think a case could be made for not keeping the raw data; Illumina
sequencers are basically a microscope in a box and they do not keep the
raw image data anymore. I believe they keep some reduced representation
of it, but I am not sure of the details.

Kurt
>
> So I'm guessing people are not keeping the raw data or I made a mistake in my calculations.
>
> Looking forward to some feedback!
>
> Sincerely,
>
> Claire
>


--
Kurt Thorn
Associate Professor
Director, Nikon Imaging Center
http://thornlab.ucsf.edu/
http://nic.ucsf.edu/blog/
mcammer

Re: Data Storage

In reply to this post by Claire Brown

We are very fortunate to have institutional support of oodles of space.  This is necessary because new imaging instruments can generate up to 2 TB of data in an overnight run.  

However, speed is a problem.  The network infrastructure was built years ago, so it used to take 6-8 hours to transfer a TB to or from the server.  Recently this has been shortened to 3-4 hours, but it still feels long.  IT support is working hard to make file transfers faster, but I want to point out that the network infrastructure needs to match the data sizes.
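For context, the throughput implied by those transfer times is easy to back out; a rough sketch (decimal TB assumed, times from the post):

```python
# Average throughput implied by a bulk transfer: size divided by time.
def throughput_MBps(terabytes, hours):
    """Effective transfer rate in MB/s (decimal units)."""
    return terabytes * 1e6 / (hours * 3600)

old = throughput_MBps(1, 7)      # midpoint of the 6-8 hour range
new = throughput_MBps(1, 3.5)    # midpoint of the 3-4 hour range
print(f"old network: ~{old:.0f} MB/s")     # ~40 MB/s
print(f"after upgrade: ~{new:.0f} MB/s")   # ~79 MB/s
```

Since gigabit Ethernet tops out at 125 MB/s in theory (and somewhat less in practice), the post-upgrade rate is roughly what a busy 1 GbE link delivers; moving to 10 GbE is what would change the picture.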

=========================================================================
 Michael Cammer, Microscopy Core & Skirball Institute, NYU Langone Medical Center
                      Cell:  914-309-3270     ** Office: Skirball 2nd Floor main office, back right **
          http://ocs.med.nyu.edu/microscopy & http://microscopynotes.com/


Jorand, Raphael

Re: Data Storage

In reply to this post by Kurt Thorn

Hi,

I agree with Kurt. Our 2-colour STORM acquisitions of 20,000 frames at 256 x 256 pixels are usually around 6 GB. So over a year it adds up, but it is not impossible to manage; probably less than 10 TB per year.
We thought a lot about not keeping the raw data, but experience showed that we sometimes need to re-analyze it to extract new kinds of information.

Raphael  

Joshua Zachary Rappoport

Re: Data Storage

In reply to this post by mcammer

Hi Claire,

There is certainly a rationale and role for core facilities either way, but it needs to be clear whether the data are the responsibility of the PIs or of the core.
We recently transitioned from the latter to the former, and it has greatly improved our peace of mind.

Happy to discuss specifics offline; in fact, I will be giving a talk about our recent experiences in this area at CTLS 2016 at EMBL in June.

:)

Best,

Josh

Reece, Jeff (NIH/NIDDK) [E]

Re: Data Storage

In reply to this post by Jorand, Raphael

I agree with Raphael on the practical side.  Currently the cost of storing a TB of data for one year is typically a fraction of the cost of the rest of the experiment.  

I disagree with Illumina's policy on ethical grounds, but perhaps it really was too costly to store the raw data when their product was developed.

Any project funded through the US DHHS is required to keep all relevant raw data for at least 3 years.
https://ori.hhs.gov/education/products/clinicaltools/data.pdf  (see p.16)

Kind Regards,
Jeff

Claire Brown

Re: Data Storage

In reply to this post by Claire Brown

Thank you, everyone, for the feedback. I'll update my calculations for 16 bits
= 2 bytes. That helps a bit.

We will propose to put fiber optic on all connections but it remains to be
seen who will pay for that.

I remember when I started in my facility: we had a brand new $750k confocal,
but no one could pay $100 to replace the beat-up old lab stool at the
microscope!

We have always had a policy that the core assumes no responsibility for image
data, but we also try to offer a service and make sure our users and our
institution keep up with the needs. Not so easy in a field that is changing so
quickly.

I am going to keep the 2000 x 2000 pixels in the calculations. I'm also doing
similar calculations for light sheet, HCS, and slide scanners, so I'll stick
to one image size. I like worst-case scenarios too: everyone is always pleased
if things turn out a bit better than the calculations. Same as budgets; I
always overestimate expenses and underestimate revenues.

I'm happy to share my final document with anyone who might be interested.

Sincerely,

Claire
Smith, Benjamin E.

Re: Data Storage


One other consideration: if data transfer is the bottleneck, it will likely be faster to compress the data before transferring it.  Also, depending on the computer and server, FTP can be significantly faster than SFTP (keeping in mind that your login and password will be transmitted unencrypted).  One other trick we use for VERY large data sets that we want to move quickly is a dedicated USB3 thumb drive (or, for TB+ data, an internal hard drive).
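Whether compressing first actually saves time depends on compressor throughput and ratio versus link speed; a back-of-envelope sketch (all rates below are illustrative assumptions, not measurements):

```python
# Plain transfer vs compress-then-transfer: which finishes first?
def transfer_time_s(size_gb, link_MBps):
    """Seconds to push size_gb over a link running at link_MBps."""
    return size_gb * 1000 / link_MBps

def compress_then_transfer_s(size_gb, link_MBps, compress_MBps, ratio):
    """ratio = compressed size / original size (0.5 means halved)."""
    compress_s = size_gb * 1000 / compress_MBps
    send_s = size_gb * 1000 * ratio / link_MBps
    return compress_s + send_s

size_gb = 160   # one two-colour, full-chip STORM stack (from the thread)
link = 40       # MB/s: an assumed slow shared network
plain = transfer_time_s(size_gb, link)                      # 4000 s
packed = compress_then_transfer_s(size_gb, link, 200, 0.5)  # 800 + 2000 = 2800 s
print(f"plain: {plain/3600:.1f} h, compress first: {packed/3600:.1f} h")
```

With these numbers, compressing wins whenever the compressor sustains more than link / (1 - ratio) = 80 MB/s; on a fast enough link the inequality flips and compressing first only adds latency.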

-Ben Smith

Reece, Jeff (NIH/NIDDK) [E]

Re: Data Storage


For very large data sets, consider investing in SSDs and 10 Gb Ethernet on your computers, with Cat6 Ethernet cabling.
The latest technology is roughly an order of magnitude faster than USB3 thumb drives.
But I like USB external HDs too.

Cheers,
Jeff

jerie

Re: Data Storage

In reply to this post by Claire Brown

-commercial content warning-

Dear Listers,

It might be interesting to some of you that the company Acquifer (an
EMBL/KIT spinoff) has developed a data solution for STORM & SPIM
applications and screening workflows.

Key features are:
- write rates of > 800 MB/s for streaming from 2 external cameras
onto a RAID 5/6 array;
- 128 GB RAM, 3.5 GHz hex-core CPU, and a GTX 970 graphics card for GPU
processing/deconvolution/data compression;
- software-defined networking router powered by OCEDO (now RIVERBED) for
secure remote viewing/remote desktop applications;
- 47 TB of intermediate storage.

The system runs Windows Server 2012r2 and has been tested in
various microscope facilities and screening labs with many open-source
image/data processing packages (e.g. Fiji, ilastik, CellProfiler, KNIME).
It helps to drastically reduce network traffic.

More info is available at https://www.acquifer.de/bigdata-logistics/

I am happy to answer questions or forward you to the respective tech
specialists.

Disclaimer: I have been doing paid consultancy work for Acquifer.

Cheers, Jens

Dr. Jens Rietdorf, visiting scientist @ center for technological
development in health CDTS, Oswaldo Cruz Foundation Fiocruz, Rio de Janeiro
Brasil.

Guy Cox-2

Re: Data Storage

In reply to this post by Jorand, Raphael

OK, I don't do stochastic super-resolution, but I am familiar with it.  It seems to me (having worked on data compression in the distant past) that these data, being sparse, should be VERY highly compressible without loss.  Has anyone looked at this?  I'd reckon this would make the problem disappear.
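This intuition is easy to sanity-check on synthetic data; a sketch with zlib on an artificial sparse frame (a noise-free baseline plus a few bright spots, so real camera frames with read noise will compress much less than this best case):

```python
import random
import zlib

# Synthetic 256x256 16-bit frame: flat baseline plus ~30 "emitter" pixels.
random.seed(0)
w = h = 256
frame = [100] * (w * h)               # uniform background counts
for _ in range(30):                   # sparse active emitters
    frame[random.randrange(w * h)] = 4000

raw = b"".join(v.to_bytes(2, "little") for v in frame)
packed = zlib.compress(raw, level=6)
print(f"raw {len(raw)} B -> {len(packed)} B "
      f"({len(raw) / len(packed):.0f}x smaller)")
```

On real, noisy frames lossless ratios are far more modest (often only a few fold), so compression helps a lot but does not by itself make a multi-TB experiment disappear.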

                                                Guy

-----Original Message-----
From: Confocal Microscopy List [mailto:[hidden email]] On Behalf Of Jorand, Raphael
Sent: Tuesday, 12 April 2016 2:32 AM
To: [hidden email]
Subject: Re: Data Storage

Hi,

I agree with Kurt. Our 2-color STORM acquisitions of 20,000 frames at 256 x 256 are usually around 6 GB, so over a year it is a lot but not impossible to manage; probably less than 10 TB per year.
We thought a lot about not keeping the raw data, but experience showed that we sometimes needed to re-analyze it to extract new kinds of information.

Raphael  
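
Raphael's figure is easy to sanity-check; a back-of-envelope sketch, assuming 16-bit pixels and the frame counts and dimensions stated above:

```python
# Per-dataset size for 2-color STORM: 20,000 frames, 256 x 256 pixels,
# 2 channels, 2 bytes (16 bits) per pixel.
frames, channels, width, height, bytes_per_px = 20_000, 2, 256, 256, 2
per_dataset = frames * channels * width * height * bytes_per_px
print(per_dataset / 1e9)  # ~5.2 GB, consistent with the "around 6 GB" quoted
```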

---------------------------------------------------------------------
*SECURITY/CONFIDENTIALITY WARNING:
This message and any attachments are intended solely for the individual or entity to which they are addressed. This communication may contain information that is privileged, confidential, or exempt from disclosure under applicable law (e.g., personal health information, research data, financial information). Because this e-mail has been sent without encryption, individuals other than the intended recipient may be able to view the information, forward it to others or tamper with the information without the knowledge or consent of the sender. If you are not the intended recipient, or the employee or person responsible for delivering the message to the intended recipient, any dissemination, distribution or copying of the communication is strictly prohibited. If you received the communication in error, please notify the sender immediately by replying to this message and deleting the message and any accompanying files from your system. If, due to the security risks, you do not wish to receive further communications via e-mail, please reply to this message and inform the sender that you do not wish to receive further e-mail from the sender. (fpc5p)
---------------------------------------------------------------------
Zdenek Svindrych-2

Re: Data Storage

Hi Guy,
ideally, it should work. But in reality each pixel contains noise (remember
the cameras usually output 16-bit values), so if you only store the (sparse)
blinking events and ignore the noise, the compression is no longer lossless.
Nor can you use off-the-shelf lossy compression algorithms, as they will
bias important parameters in your image; e.g., the noise around a
'light blob' may be used by your localization algorithm.

You can localize all the sparse events and store all possible parameters,
including background intensities and variations, and call this an 'image
compression'. And indeed this should be fine for archiving purposes, if you
show that a reconstructed ('decompressed') image is 'statistically
indistinguishable' from the original. But, as pointed out earlier, you most
likely cannot re-process the "decompressed" images with new localization
algorithms...

But the discussion about archiving "raw" raw data is somewhat academic anyway;
even the cameras themselves process the image internally (hot-pixel correction, etc.)...

zdenek
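
Zdenek's objection can be demonstrated directly: give every pixel a noisy baseline and the same lossless compressor that shrinks an all-zero frame by orders of magnitude manages only a small factor. A minimal sketch (synthetic Gaussian read noise, all parameters made up):

```python
import random
import zlib

random.seed(0)

# A 256 x 256, 16-bit frame where every pixel carries a camera baseline of
# ~100 counts plus Gaussian read noise, as on a real sensor.
noisy = bytearray()
for _ in range(256 * 256):
    value = max(0, round(random.gauss(100, 10)))
    noisy += value.to_bytes(2, "little")

# An all-zero frame of the same size, standing in for the noise-free ideal.
clean = bytes(256 * 256 * 2)

ratio_noisy = len(noisy) / len(zlib.compress(bytes(noisy), 9))
ratio_clean = len(clean) / len(zlib.compress(clean, 9))
print(ratio_noisy, ratio_clean)  # noise costs orders of magnitude of compressibility
```

The noisy frame's compressibility is capped by the entropy of the read noise, so no lossless scheme can do much better than this.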


lechristophe

Re: Data Storage

Hi,

There is a "sort of" way to compress SMLM data: pre-process the image to
detect peaks in-camera, then send only these regions to the computer and
store them. This was implemented in this paper:
https://www.osapublishing.org/ol/abstract.cfm?uri=ol-38-11-1769

I just saw that the new Photometrics sCMOS camera has a function they call
"PrimeLocate" (see
http://www.photometrics.com/products/datasheets/Prime-Datasheet.pdf) that
can apparently handle up to 500 peak regions per frame.

No commercial interest and I haven't tested this, but it looks interesting.
Of course it is not lossless, and I'd be wary about long-term support and
usability of raw data. But sCMOS SMLM really generates a huge amount of
data!

Christophe

--
Christophe Leterrier
Chercheur
Equipe Architecture des Domaines Axonaux
CRN2M CNRS UMR 7286 - Aix Marseille Université


Guy Cox-2

Re: Data Storage

OK, I understand the points made in response to my original post, and noisy images will absolutely not compress with the algorithm I was playing about with 25 years ago. BUT, if the imaging technique is going to work, the 'signal' pixels must be substantially above the 'noise' pixels. In a 16-bit file, if the noise is all in the lower 8 bits then we have an automatic minimum 50% compression with no loss. And many cameras actually deliver only a 12-bit image, saved in a 16-bit file, so we have a 25% saving before we even look at the image data! In other words, I see plenty of room for lossless compression.

                        Guy

Guy Cox, Honorary Associate Professor
School of Medical Sciences

Australian Centre for Microscopy and Microanalysis,
Madsen, F09, University of Sydney, NSW 2006
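
Guy's 12-bits-in-a-16-bit-file point is concrete: two 12-bit pixels fit in three bytes, a guaranteed lossless 25% saving before any entropy coding is attempted. A minimal sketch (pure-Python packing for illustration; a real reader would vectorise this):

```python
def pack12(values):
    """Pack pairs of 12-bit pixel values into 3 bytes per pair (lossless)."""
    assert len(values) % 2 == 0 and all(0 <= v < 4096 for v in values)
    out = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        out.append(lo & 0xFF)                       # low 8 bits of first pixel
        out.append((lo >> 8) | ((hi & 0x0F) << 4))  # 4 + 4 bits, one from each
        out.append(hi >> 4)                         # high 8 bits of second pixel
    return bytes(out)

def unpack12(data):
    """Invert pack12, recovering the original 12-bit values exactly."""
    values = []
    for i in range(0, len(data), 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        values.append(b0 | ((b1 & 0x0F) << 8))
        values.append((b1 >> 4) | (b2 << 4))
    return values

pixels = [0, 4095, 1234, 2048]
packed = pack12(pixels)
assert unpack12(packed) == pixels        # round-trip is exact, i.e. lossless
print(len(packed) / (2 * len(pixels)))   # 0.75: the 25% saving Guy describes
```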
