Re: data storage requirements for NIH records - original videos or just tracks?

Posted by Mike Esterman on
URL: http://confocal-microscopy-list.275.s1.nabble.com/data-storage-requirements-for-NIH-records-original-videos-or-just-tracks-tp7579647p7579658.html

*****
To join, leave or search the confocal microscopy listserv, go to:
http://lists.umn.edu/cgi-bin/wa?A0=confocalmicroscopy
*****

To those following this thread,

First, I concur with what Tim has written; he is absolutely correct.  When I
was working in the pharmaceutical industry we wrestled with this problem,
because storage was much more expensive than it is today and we were just
getting into high-content imaging, confocal, and small-animal CT and MRI
imaging.  We really struggled with the cost/benefit, but realized that for
us spending a few tens of thousands of dollars was worth it to avoid an FDA
citation for violating their data guidelines.  Also, after I retired I was
hired as a consultant to clear a scientist of scientific misconduct, and the
case was difficult because many of the original images had been lost.  This
stigma has followed this scientist for the last 6 years!

I haven't been following cloud storage, but at one time Amazon was offering
really cheap storage.  Also take a look at Backblaze:
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/

67 TB for $7,867.  Also look at some of the new options for compression,
especially if you can afford to lose a bit of resolution as long as the
position data is retained.  Some of the new lossless algorithms are very
good.
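
For a rough sense of scale, the figures in this thread can be combined in a
few lines of Python.  This is only a back-of-the-envelope sketch: the
function names are made up for illustration, the 1024 GB per TB conversion
is an assumption (it reproduces the "hour and 25 minutes" figure in the
original post), and the cost covers only the raw 2009 Backblaze hardware
price quoted above, with no allowance for redundancy, power, or
administration.

```python
# Back-of-the-envelope storage estimate using figures quoted in this thread.
# All names here are illustrative, not from any library.

GB_PER_VIDEO_MINUTE = 1.0   # "about a gigabyte" per 1-minute video at 60 FPS
PARALLEL_VIDEOS = 12        # videos acquired simultaneously
POD_CAPACITY_TB = 67.0      # Backblaze pod capacity quoted above
POD_COST_USD = 7867.0       # Backblaze pod hardware cost quoted above
GB_PER_TB = 1024.0          # assumption: binary terabyte


def gb_per_minute(videos=PARALLEL_VIDEOS, gb_each=GB_PER_VIDEO_MINUTE):
    """Aggregate acquisition rate in GB per minute."""
    return videos * gb_each


def minutes_per_tb(rate_gb_per_min):
    """Minutes of continuous acquisition needed to fill one terabyte."""
    return GB_PER_TB / rate_gb_per_min


def cost_per_tb(cost=POD_COST_USD, capacity_tb=POD_CAPACITY_TB):
    """Raw hardware cost per terabyte, in USD."""
    return cost / capacity_tb


def hardware_cost_per_hour(rate_gb_per_min):
    """Raw hardware cost (USD) of storing one hour of acquisition."""
    tb_per_hour = rate_gb_per_min * 60.0 / GB_PER_TB
    return tb_per_hour * cost_per_tb()


if __name__ == "__main__":
    rate = gb_per_minute()
    print(f"rate: {rate:.0f} GB/min")
    print(f"time to fill 1 TB: {minutes_per_tb(rate):.0f} min")
    print(f"raw cost per TB: ${cost_per_tb():.0f}")
    print(f"raw cost per acquisition hour: ${hardware_cost_per_hour(rate):.0f}")
```

Under those assumptions, 12 GB/minute fills a terabyte in about 85 minutes,
and raw disk for one hour of continuous 12-video acquisition comes to a
little over $80.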

Mike Esterman
Imaging Consultant


-----Original Message-----
From: Confocal Microscopy List [mailto:[hidden email]] On Behalf Of Tim Feinstein
Sent: Wednesday, February 06, 2013 10:25 AM
To: [hidden email]
Subject: Re: data storage requirements for NIH records - original videos or just tracks?

Hi all,

Random compliance inspections for data storage seem to be vanishingly rare.
I would suggest that a much greater concern is that original data could
prove essential to resolving questions of misconduct, erratum corrections,
etc.  Journals increasingly follow the lead of JCB and perform manipulation
tests on most or all submitted work, and researchers who have not memorized
which journal forbids which acts (that is to say, most people) will very
badly want their original stuff in case of trouble.  People without their
original(ish) data could have a publication rejected or catastrophically
delayed or, worse, retracted.  For want of a nail, etc.

IMO it is a huge risk to get rid of unmodified data and the NIH is trying to
help people avoid trouble.  The (decreasing) cost of data storage is
annoying but also a sometimes helpful corrective against performing
experiments with unnecessary complexity (a real temptation when many scopes
can perform multicolor Z series at or near video rate).  In cases where
complexity is very much necessary, it seems to me that even a decent RAID
array does not cost much next to the imaging system it is meant to support.

All the best,


TF

Timothy Feinstein, PhD
Visiting Research Associate
Laboratory for GPCR Biology
Dept. of Pharmacology & Chemical Biology
University of Pittsburgh, School of Medicine
BST W1301, 200 Lothrop St.
Pittsburgh, PA  15261

On Feb 6, 2013, at 9:25 AM, Steffen Dietzel wrote:

> I guess you can always argue this case both ways. One example you could
> mention in your favor is that many microscopes (or rather their users) use
> frame averaging to reduce noise. So in a sense you throw away the original
> data in that case, and only a smoothed version is stored.
>
> If the truth is that you can't possibly store the original videos, the only
> alternative would be to not perform this kind of research, right? The
> situation may be different in a couple of years, assuming that storage will
> continue to become cheaper over time.
>
> At the end of the day you will have to convince whoever is (potentially)
> auditing you. So you might want to get a statement from them for your case,
> to be on the safe side.
>
> my 2 cents
>
> Steffen
>
>>
>> On 06.02.2013 8:32, O'Brien III, E. Timothy wrote:
>>> Dear Microscopists-
>>>
>>> Our group has begun using a parallel microscope system to study the
>>> movement of fluorescent beads on cells, or in biofilms, mucus, and other
>>> biological fluids.  We then track the bead movements and generate MSD
>>> (mean squared displacement) curves for each bead.  Each 1-minute video at
>>> 60 FPS takes up about a gigabyte of data storage.  Meanwhile, the tracks
>>> (position/time) might take several kB for each bead.  We can take 12
>>> videos simultaneously, so potentially we are generating 12 GB/minute, a
>>> terabyte every hour and 25 minutes!
>>>
>>> We believe that taking an image at the beginning of tracking, and
>>> keeping the tracking records, would be sufficient for us to troubleshoot
>>> our data, since we can't possibly store the original videos.  This would
>>> let us know where the beads were at the beginning of the video (on the
>>> nucleus?  On the glass?).  Signatures of "lost beads" or "stuck beads"
>>> are easily identified in control experiments.
>>>
>>> We are also considering other intermediate data reduction, potentially
>>> saving parts of the videos throughout the timecourse.  But this is going
>>> to be difficult to implement and keep track of.  Moreover, the reduction
>>> is not nearly as high as taking one frame and keeping the tracking
>>> results.
>>>
>>> What is the community's understanding of the requirements for storing
>>> "original" data?  Do we need to keep full videos and spend all our budget
>>> on hard drives, or will just the position/time data and an index frame be
>>> enough?
>>>
>>> What other solutions does your group use?
>>>
>>> Thanks very much!
>>>
>>> Tim O'Brien
>>> Computer Integrated Systems for Microscopy and Manipulation
>>> UNC Chapel Hill, North Carolina
>>
>>
>>
>
>
> --
> ------------------------------------------------------------
> Steffen Dietzel, PD Dr. rer. nat
> Ludwig-Maximilians-Universität München
> Walter-Brendel-Zentrum für experimentelle Medizin (WBex)
> Head of light microscopy
>
> Mail room:
> Marchioninistr. 15, D-81377 München
>
> Building location:
> Marchioninistr. 27,  München-Großhadern