data storage requirements for NIH records - original videos or just tracks?

Tim O'Brien Sr.

*****
To join, leave or search the confocal microscopy listserv, go to:
http://lists.umn.edu/cgi-bin/wa?A0=confocalmicroscopy
*****

Dear Microscopists-

Our group has begun using a parallel microscope system to study the movement of fluorescent beads on cells, or in biofilms, mucus, and other biological fluids.  We then track the bead movements and generate MSD (mean squared displacement) curves for each bead.  Each 1-minute video at 60 FPS takes up about a gigabyte of data storage.  Meanwhile, the tracks (position/time) might take several kB for each bead.  We can take 12 videos simultaneously, so potentially we are generating 12 GB/minute, a terabyte every hour and 25 minutes!
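
As a rough check on these figures, the arithmetic can be sketched in a few lines of Python. The constants are the numbers quoted above; the binary convention (1 TB = 1024 GB) and the function name are assumptions made for illustration only.

```python
# Back-of-the-envelope data rate for the acquisition described above.
# Assumption: 1 TB = 1024 GB (binary convention); names are illustrative.

GB_PER_VIDEO_MINUTE = 1.0   # about 1 GB per 1-minute video at 60 FPS
PARALLEL_VIDEOS = 12        # videos acquired simultaneously
GB_PER_TB = 1024


def minutes_to_fill(terabytes: float) -> float:
    """Minutes of continuous 12-video acquisition needed to fill `terabytes`."""
    gb_per_minute = GB_PER_VIDEO_MINUTE * PARALLEL_VIDEOS
    return terabytes * GB_PER_TB / gb_per_minute


if __name__ == "__main__":
    minutes = minutes_to_fill(1.0)
    hours, rest = divmod(minutes, 60)
    print(f"1 TB fills in about {int(hours)} h {rest:.0f} min")
```

With the decimal convention (1 TB = 1000 GB) the result is about two minutes shorter, so the "hour and 25 minutes" estimate holds either way.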

We believe that taking an image at the beginning of tracking, and keeping the tracking records would be sufficient for us to troubleshoot our data, since we can't possibly store the original videos.  This would let us know where the beads were at the beginning of the video (on the nucleus?  On the glass?) Signatures of "lost beads" or "stuck beads" are easily identified in control experiments.
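
To illustrate what can still be recovered from the tracks alone, here is a minimal sketch of a time-averaged MSD calculation on stored (x, y) positions. It is not our tracking software: the function name, the synthetic tracks, and the noise levels are all assumptions chosen for the example, and it assumes a fixed frame interval.

```python
import random


def mean_squared_displacement(track, max_lag):
    """Time-averaged MSD of one track for lags 1..max_lag (in frames).

    track is a list of (x, y) positions sampled at a fixed frame interval;
    max_lag must be smaller than the number of positions.
    """
    msd = []
    for lag in range(1, max_lag + 1):
        # Pair every position with the one `lag` frames later.
        squared = [
            (x2 - x1) ** 2 + (y2 - y1) ** 2
            for (x1, y1), (x2, y2) in zip(track, track[lag:])
        ]
        msd.append(sum(squared) / len(squared))
    return msd


if __name__ == "__main__":
    rng = random.Random(0)
    n_frames = 3600  # one minute at 60 FPS

    # Synthetic freely diffusing bead: a 2D random walk.
    x = y = 0.0
    diffusing = []
    for _ in range(n_frames):
        x += rng.gauss(0.0, 0.05)
        y += rng.gauss(0.0, 0.05)
        diffusing.append((x, y))

    # Synthetic stuck bead: fixed position plus localization noise only.
    stuck = [(rng.gauss(0.0, 0.01), rng.gauss(0.0, 0.01)) for _ in range(n_frames)]

    print(mean_squared_displacement(diffusing, 5))  # grows roughly linearly with lag
    print(mean_squared_displacement(stuck, 5))      # stays roughly flat
```

A flat MSD curve like the second one is the kind of "stuck bead" signature that can be flagged from the position/time record without going back to the video.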

We are also considering other intermediate data reduction, potentially saving parts of the videos throughout the timecourse.  But this is going to be difficult to implement and keep track of.  Moreover, the reduction is not nearly as high as taking one frame and keeping the tracking results.

What is the community's understanding of the requirements for storing "original" data?  Do we need to keep full videos and spend all our budget on hard drives, or will just the position/time data and an index frame be enough?

What other solutions does your group use?

Thanks very much!

Tim O'Brien
Computer Integrated Systems for Microscopy and Manipulation
UNC Chapel Hill, North Carolina
mcammer

Re: data storage requirements for NIH records - original videos or just tracks?

My understanding is that you have to keep the raw data.  Annotating it is a good idea.
I haven't heard of anyone being audited, but don't be surprised when it happens.
________________________________________________________
Michael Cammer, Assistant Research Scientist
Skirball Institute of Biomolecular Medicine
Lab: (212) 263-3208  Cell: (914) 309-3270
Dmitry Sokolov

Re: data storage requirements for NIH records - original videos or just tracks?

In reply to this post by Tim O'Brien Sr.

Hi Tim,

Good question! I believe what you are asking relates to data sampling.

The scientific method is about the reproducibility of the experiments
under the conditions given.
The Technology of Research deals with the sustainability of human
activities:
http://confocal-manawatu.pbworks.com/w/page/48682494/Goals%20of%20Technology%20of%20Research
In your case we should probably talk about the sustainability of your
research.

If the high data sampling rate makes your research unsustainable, it
will not be practical to confirm your results either.
However, if your secondary data (the tracks) are reproducible, you are safe.

A geological expedition is probably a good analogy for your experiment. I
believe that its description is satisfactory when based on the GPS data,
physical samples and photos from the sampling sites. High-definition
real-time satellite movies would probably be useful but are still not
required. The volume and character of the data must be adequate to the
objectives of the problem. The excess data in your raw images form the
"noise" that is meant to be "filtered" by your particle tracking
algorithms. This is the fundamental problem of scientific
instrumentation as the human/nature interface.

I hope you find it useful.
Other opinions would be highly appreciated.

Published in MIAWiki:
http://confocal-manawatu.pbworks.com/w/page/63370062/Scientific%20Instrumentation%20as%20Human%20-%20Nature%20Interface

Cheers,
Dmitry

Advanced Knowledge Management
for MICROSCOPY and Image Analysis
------------------------------------------------------------------------
Dmitry Sokolov, Ph.D.
Mob: +64 21 063 5382



06.02.2013 8:32, O'Brien III, E. Timothy wrote:

> What is the community's understanding of the requirements for storing "original" data?  Do we need to keep full videos and spend all our budget on hard drives, or will just the position/time data and an index frame be enough?
Andreas Bruckbauer

Re: data storage requirements for NIH records - original videos or just tracks?

Dear all,

I think this is an important question and also a problem with other microscopy techniques like super-resolution localisation microscopy or SPIM time-lapse recordings. One aspect is that one might want to reanalyse the videos at a later time with an improved tracking algorithm and would need the raw data for this. Lossless compression could help, as well as cropping out the important part of the image. But when the amount of data created is just prohibitive, one might argue that it should be easier to repeat the experiment than to save all the data. This would mean making sure there will be access to the samples and keeping the microscopes as well as the knowledge of how to use them...
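
As a minimal sketch of the cropping and lossless compression idea, assuming 8-bit grayscale frames stored row by row, one could do something like the following in Python. The `crop` helper and the synthetic frame are hypothetical; `zlib` from the standard library stands in for whatever lossless codec is actually used.

```python
import zlib


def crop(frame: bytes, width: int, x0: int, y0: int, w: int, h: int) -> bytes:
    """Cut a w-by-h region out of a row-major, 8-bit grayscale frame."""
    rows = (
        frame[(y0 + r) * width + x0 : (y0 + r) * width + x0 + w]
        for r in range(h)
    )
    return b"".join(rows)


if __name__ == "__main__":
    width, height = 64, 48
    # Synthetic frame: dark background with one bright 4x4 "bead".
    pixels = bytearray(width * height)
    for r in range(20, 24):
        for c in range(30, 34):
            pixels[r * width + c] = 255
    frame = bytes(pixels)

    # Keep only a 16x16 region around the bead, then compress it losslessly.
    roi = crop(frame, width, x0=24, y0=14, w=16, h=16)
    packed = zlib.compress(roi, level=9)

    # Lossless means the stored bytes decompress to exactly the cropped data.
    assert zlib.decompress(packed) == roi
    print(len(frame), len(roi), len(packed))
```

Real camera frames contain noise and will compress far less than this synthetic example, so the gain from lossless compression alone is likely to be modest; most of the saving in this sketch comes from the crop.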

 

 best wishes

Andreas

 

Steffen Dietzel

Re: data storage requirements for NIH records - original videos or just tracks?

I guess you can always argue this case both ways. One example you could
mention in your favor is that many microscopes (their users) use frame
averaging to reduce noise. So in a sense you throw away the original
data in that case and only a smoothed version is stored.

If the truth is that you can't possibly store the original videos, the
only alternative would be to not perform this kind of research, right?
The situation may be different in a couple of years, assuming that
storage will continue to become cheaper over time.

At the end of the day you will have to convince whoever is (potentially)
auditing you. So you might want to get a statement from them for your
case to be on the safe side.

my 2 cents

Steffen

--
------------------------------------------------------------
Steffen Dietzel, PD Dr. rer. nat
Ludwig-Maximilians-Universität München
Walter-Brendel-Zentrum für experimentelle Medizin (WBex)
Head of light microscopy

Mail room:
Marchioninistr. 15, D-81377 München

Building location:
Marchioninistr. 27,  München-Großhadern
mcammer

Re: data storage requirements for NIH records - original videos or just tracks?

In reply to this post by Andreas Bruckbauer

Dear Tim et al.,

Based on meetings I was in during the spring & summer, my employer wants everything saved, and IT is moving in the direction of facilitating this. (Right now it is reasonable and practical for us to buy USB hard drives and store them off site; I live 20 miles north of work, so my house is the repository for now.)  Based on a quick Google search this morning and the paragraph quoted below, the question isn't only what the government mandates (which may be contradictory depending on where you look), but what your university requires.  Google turns up a number of university-specific guidelines.  So, in your case, does UNC have a policy?

Take a look at this vague statement on the Office of Research Integrity web site:
http://ori.dhhs.gov/education/products/RCRintro/c06/3c6.html
"Period of retention. Data should be retained for a reasonable period of time to allow other researchers to check results or to use the data for other purposes. There is, however, no common definition of a reasonable period of time. NIH generally requires that data be retained for 3 years following the submission of the final financial report. Some government programs require retention for up to 7 years. A few universities have adopted data-retention policies that set specific time periods in the same range, that is, between 3 and 7 years. Aside from these specific guidelines, however, there is no comprehensive rule for data retention or, when called for, data destruction."

Regards,
Michael

Tim Feinstein-2

Re: data storage requirements for NIH records - original videos or just tracks?

In reply to this post by Steffen Dietzel

Hi all,

Random compliance inspections for data storage seem to be vanishingly rare.  I would suggest a much greater concern: original data could prove essential to resolving questions of misconduct, erratum corrections, etc.  Journals increasingly follow the lead of JCB and perform manipulation tests on most or all submitted work, and researchers who have not memorized which journal forbids which acts (that is to say, most people) will very badly want their original stuff in case of trouble.  People without their original(ish) data could have a publication rejected or catastrophically delayed or, worse, retracted.  For want of a nail, etc.

IMO it is a huge risk to get rid of unmodified data, and the NIH is trying to help people avoid trouble.  The (decreasing) cost of data storage is annoying but also a sometimes helpful corrective against performing experiments with unnecessary complexity (a real temptation when many scopes can perform multicolor Z series at or near video rate).  In cases where complexity is very much necessary, it seems to me that even a decent RAID array does not cost much next to the imaging system it is meant to support.

All the best,


TF

Timothy Feinstein, PhD
Visiting Research Associate
Laboratory for GPCR Biology
Dept. of Pharmacology & Chemical Biology
University of Pittsburgh, School of Medicine
BST W1301, 200 Lothrop St.
Pittsburgh, PA  15261

Tim Feinstein-2

Re: data storage requirements for NIH records - original videos or just tracks?

In reply to this post by Steffen Dietzel

Apologies for not addressing the original question!  The best compromise may be to anticipate likely future concerns and store data appropriately.  For example, save one complete image series to show that the tracking algorithm does an acceptable job by whatever standards apply, then store the first and last image from each subsequent movie to show that it started and finished within acceptable parameters (focus, magnification, type of object chosen for tracking, brightness, signal to noise, etc.).
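
A minimal sketch of that retention scheme, in Python, might look like this. The function name and the convention that movie 0 is the validation series are assumptions for illustration; the frame counts reuse the 60 FPS, 1-minute, 12-video figures from the original post.

```python
def frames_to_keep(movie_index: int, n_frames: int) -> list[int]:
    """Frame indices to archive for one movie under the scheme above.

    Movie 0 is kept whole as the validation series; every later movie
    keeps only its first and last frame.
    """
    if n_frames <= 0:
        return []
    if movie_index == 0:
        return list(range(n_frames))
    return sorted({0, n_frames - 1})


if __name__ == "__main__":
    n_frames = 3600  # one minute at 60 FPS
    n_movies = 12
    kept = sum(len(frames_to_keep(i, n_frames)) for i in range(n_movies))
    total = n_movies * n_frames
    print(f"kept {kept} of {total} frames ({100 * kept / total:.1f}%)")
```

Almost all of what is kept is the one validation series, so the fraction retained drops further as more movies are recorded against the same validation run.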

cheers,


TF

Timothy Feinstein, PhD
Visiting Research Associate
Laboratory for GPCR Biology
Dept. of Pharmacology & Chemical Biology
University of Pittsburgh, School of Medicine
BST W1301, 200 Lothrop St.
Pittsburgh, PA  15261

Mike Esterman

Re: data storage requirements for NIH records - original videos or just tracks?

In reply to this post by Tim Feinstein-2

To those following this thread,

First, I concur with what Tim has written; he is absolutely correct.  When I
was working in the pharmaceutical industry we wrestled with this problem,
because storage was much more expensive than it is today and we were just
getting into High Content Imaging, confocal, and small-animal CT and MRI
imaging.  We really struggled with the cost/benefit, but realized that for us
spending a few tens of thousands to avoid an FDA citation for violating
their data guidelines was worth it.  Also, after I retired I was hired as a
consultant to clear a scientist of scientific misconduct, and the case was
difficult because many of the original images had been lost.  This stigma
has followed this scientist for the last 6 years!

I haven't been following cloud storage, but at one time Amazon was offering
really cheap storage.  Also take a look at Backblaze:
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/

67 TB for $7,867.  Also look at some of the new options for compression,
especially if you can afford to lose a bit of resolution as long as the
position data is retained.  Some of the new lossless algorithms are very
good.
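
As a rough back-of-the-envelope check on those figures, here is a minimal Python sketch. It takes the 12 GB/minute acquisition rate from the original post and the 67 TB for $7,867 quoted above; the function names and the 1 TB = 1000 GB convention are assumptions made here for illustration, and it ignores redundancy, backups and running costs.

```python
# Figures quoted in this thread; everything else is an illustrative assumption.
ACQUISITION_GB_PER_MIN = 12.0   # 12 simultaneous videos, ~1 GB per minute each
POD_CAPACITY_TB = 67.0          # Backblaze figure quoted above
POD_COST_USD = 7867.0           # Backblaze figure quoted above


def cost_per_tb(cost_usd, capacity_tb):
    """Hardware cost per terabyte of raw capacity."""
    return cost_usd / capacity_tb


def minutes_per_tb(rate_gb_per_min, gb_per_tb=1000.0):
    """Minutes of acquisition needed to produce one terabyte."""
    return gb_per_tb / rate_gb_per_min


def hours_to_fill(capacity_tb, rate_gb_per_min, gb_per_tb=1000.0):
    """Hours of continuous acquisition that fit in the given capacity."""
    return capacity_tb * minutes_per_tb(rate_gb_per_min, gb_per_tb) / 60.0


if __name__ == "__main__":
    print(f"cost per TB: ${cost_per_tb(POD_COST_USD, POD_CAPACITY_TB):.2f}")
    print(f"minutes per TB: {minutes_per_tb(ACQUISITION_GB_PER_MIN):.1f}")
    print(f"hours to fill: {hours_to_fill(POD_CAPACITY_TB, ACQUISITION_GB_PER_MIN):.1f}")
```

Under these assumptions the hardware comes to a little under $120 per TB, and 67 TB holds roughly 93 hours of continuous 12-camera acquisition. Passing gb_per_tb=1024 gives about 85 minutes per terabyte, in line with the hour and 25 minutes in the original post.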

Mike Esterman
Imaging Consultant


-----Original Message-----
From: Confocal Microscopy List [mailto:[hidden email]] On Behalf Of Tim Feinstein
Sent: Wednesday, February 06, 2013 10:25 AM
To: [hidden email]
Subject: Re: data storage requirements for NIH records - original videos or just tracks?

Hi all,

Random compliance inspections for data storage seem to be vanishingly rare.
I would suggest a much greater concern: original data could prove
essential to resolving questions of misconduct, erratum corrections, etc.
Journals increasingly follow the lead of JCB and perform manipulation tests
on most or all submitted work, and researchers who have not memorized which
journal forbids which acts (that is to say, most people) will very badly
want their original stuff in case of trouble.  People without their
original(ish) data could have a publication rejected or catastrophically
delayed or, worse, retracted.  For want of a nail, etc.

IMO it is a huge risk to get rid of unmodified data and the NIH is trying to
help people avoid trouble.  The (decreasing) cost of data storage is
annoying but also a sometimes helpful corrective against performing
experiments with unnecessary complexity (a real temptation when many scopes
can perform multicolor Z series at or near video rate).  In cases where
complexity is very much necessary it seems to me that even a decent RAID
array costs not so much next to the imaging system it is meant to support.  

All the best,


TF

Timothy Feinstein, PhD
Visiting Research Associate
Laboratory for GPCR Biology
Dept. of Pharmacology & Chemical Biology
University of Pittsburgh, School of Medicine
BST W1301, 200 Lothrop St.
Pittsburgh, PA  15261

On Feb 6, 2013, at 9:25 AM, Steffen Dietzel wrote:

> I guess you can always argue this case both ways. One example you could
> mention in your favor is that many microscopes (their users) use frame
> averaging to reduce noise. So in a sense you throw away the original data in
> that case and only a smoothed version is stored.
>
> If the truth is that you can't possibly store the original videos the only
> alternative would be to not perform this kind of research, right? The
> situation may be different in a couple of years, assuming that storage will
> continue to become cheaper over time.
>
> At the end of the day you will have to convince whoever is (potentially)
> auditing you. So you might want to get a statement from them for your case to
> be on the safe side.
>
> my 2 cents
>
> Steffen
>
>
>
> --
> ------------------------------------------------------------
> Steffen Dietzel, PD Dr. rer. nat
> Ludwig-Maximilians-Universität München
> Walter-Brendel-Zentrum für experimentelle Medizin (WBex)
> Head of light microscopy
>
> Mail room:
> Marchioninistr. 15, D-81377 München
>
> Building location:
> Marchioninistr. 27,  München-Großhadern
mcammer

Re: data storage requirements for NIH records - original videos or just tracks?

In reply to this post by mcammer

While taking annual compliance training I came across this policy statement by my employer.  No need to read it if you trust my synopsis, which is essentially that all data must be kept for a minimum of three or five years following the completion of a project, unless the government or funding agency requires longer.
Regards,
Michael




===============================
Policy on Retention of and Access to Research Data
Issue Date: March 13, 2009
Contents:
I. Purpose and Scope
II. Policy Statements Regarding Ownership, Record Retention Responsibilities of Principal
Investigator, Access, and Disposition of Research Data Upon Departure
III. Notice to Retain Research Data
IV. Record Retention Responsibilities of the Medical Center
V. Administration of Policy
I. Purpose and Scope
Maintaining accurate and appropriate research records is essential for any research project. It
is necessary to support and substantiate funding, to protect intellectual property rights, to
facilitate management of the NYU Langone Medical Center's research program, to ensure
compliance with federal regulations and to address questions regarding research projects.
This Policy sets forth the rights and responsibilities of the Medical Center, the Principal
Investigator and all other investigators with respect to ownership, access, use and maintenance
of original Research Data created in connection with the design, conduct or reporting of
research performed at or under the auspices of NYU Langone Medical Center ("Research"). It
applies to all Medical Center faculty, staff, students, postdoctoral fellows, residents, trainees,
visiting scholars, and any other persons involved in the design, conduct or reporting of
Research, regardless of funding source or location.
For purposes of this Policy, "Research Data" means any recorded, retrievable information,
necessary for the reconstruction and evaluation of reported results of Research and the events
and processes leading to those results, regardless of the physical form or media, as well as the
materials and products generated by the Research. It includes research notes, laboratory
notebooks, case history records, study protocols, samples of chemicals and materials
synthesized during research, filed specimens, samples of human or animal tissue, tissue
databases, voucher specimens, computer files or other electronic data, software programs and
databases (including documentation thereof) and video and audio tapes, as well as the
synthetic compounds, organisms, cell lines, viruses, cell products, cloned DNA, DNA
sequences, mapping information, crystallographic coordinates, plants, animals, and
spectroscopic data generated as a result of the Research.
II. Policy Statements
A. Ownership
Research Data belong to the School of Medicine unless the School expressly waives ownership
rights under the applicable grant, contract, or agreement with a sponsor ("Sponsored Research
Agreement"), in which event the provisions of the Sponsored Research Agreement shall control.
B. Record Retention Responsibilities of Principal Investigator
The Principal Investigator is responsible for the collection, management and retention of
Research Data. The Principal Investigator shall adopt an orderly system of data organization
which dates the records being retained, and communicate the system to all research
participants. Particularly for long-term projects, the Principal Investigator shall implement
procedures for the protection of essential Research Data in the event of a natural disaster or
other emergency. Research Data shall be retained in the unit where they are produced, unless
specific permission to do otherwise is granted by the Department Chair.
All Research Data shall be retained for the longer of (i) three years after the final project closeout
or (ii) five years after the final reporting or publication of a project except as follows:
1. Research Data arising out of sponsored research must be retained for the time
period specified in the Sponsored Research Agreement;
2. Research Data relating to projects subject to the review of the Institutional Review
Board (IRB) must be retained until five (5) years after the completion of the project;
3. Research Data that incorporates Protected Health Information (PHI) or other
pertinent human subject information (e.g., medical records, protocols, case history
forms, progress reports and final reports) must be retained for the period mandated
by New York State law (six years from date of discharge or three years after the
patient's age of majority (18 years), whichever is longer, or at least six years after
death);
4. Research Data relating to clinical trials involving an investigational drug or device
must be retained until two (2) years following the date the applicable FDA marketing
application is approved or, where the investigation is discontinued, two years from
the date that the FDA is notified that no marketing application will be filed; and
5. Research Data relating to a student project must be retained at least until the degree
is granted or it is clear that the student has abandoned the work.
6. Research Data in the form of human or animal tissues in institutional tissue banks
shall be retained so long as such tissues have commercial or scientific value.
Research Data must not be destroyed or altered during the required retention periods unless
written approval for such disposition is received from the Vice Dean for Science. Unless you
have been notified by Medical Center administration not to destroy Research Data (see Section
III below), the Research Data may be destroyed at the discretion of the Principal Investigator
and his/her department or laboratory following the applicable specified retention period.
If the retention obligations cannot be carried out by the Principal Investigator (e.g., the death or
disability of the Principal Investigator), the Department Chair shall assume responsibility for the
Research Data or shall appoint a successor investigator to carry out the obligations.
C. Access
The Medical Center has the right of access to the Research Data regardless of the location of
the responsible Principal Investigator. If the Medical Center believes it necessary for safekeeping,
the Medical Center may take physical custody of any Research Data in a manner
specified by the Vice Dean for Science. Government officials and research sponsors shall have
access to Research Data to the degree specified in the Sponsor Research Agreement. Medical
Center Investigators associated with a research project shall also have the right to review all
records of Research Data relating to the project.
D. Disposition of Research Data upon Departure
Original Research Data must be retained at the Medical Center unless transfer of the Research
Data with the Principal Investigator is authorized by the Vice Dean for Science. Any investigator
who leaves the Medical Center may, to the extent feasible, take copies of the Research Data in
which the investigator is involved. If permitted by the applicable Sponsored Research
Agreement, the Vice Dean for Science may grant approval for transfer of original Research Data
to the Principal Investigator's new institution, provided that the new institution executes a written
agreement with the School of Medicine whereby the new institution (i) acknowledges the School
of Medicine's continued ownership of, and accepts custodial responsibilities for, the Research
Data, (ii) grants access to the Research Data to the Medical Center and any other party with a
right to access under this Policy, and (iii) agrees to return the original Research Data to the
Medical Center upon request, and the Principal Investigator agrees in writing with the School of
Medicine to comply with this Policy.
III. Notice to Retain Research Data
There may be situations where Research Data must be retained for periods beyond the time
frames specified in Section II(B) above. If the Medical Center gives the Principal Investigator
notice directing the Principal Investigator to segregate and retain Research Data for any reason,
including those noted below, or the Principal Investigator is independently aware of any of the
following events, the Principal Investigator shall retain such Research Data until further notice.
1. An investigation or audit (conducted either internally or by a governmental
agency), lawsuit, administrative proceeding or other form of legal process;
2. Research Data is required to obtain, protect and or defend intellectual property
rights resulting from the Research; and
3. An allegation of financial or scientific misconduct or conflict of interest has been
made.
IV. Record Retention Responsibilities of the Medical Center
As the owner of the Research Data, the Medical Center will assert its rights with respect to
Research Data in order to assure compliance with regulatory and contractual requirements. The
Medical Center will maintain financial records, supporting documents and all other records
pertinent to an award for Research for the periods required by OMB Circular A-110, Sect. 53
and this Policy.
V. Administration of Policy
Any disputes arising out of this Policy shall be adjudicated by the Vice Dean for Science. Any
questions relating to this Policy should be directed to the Office of Legal Counsel or the Office of
Research Compliance.
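
The general retention rule in Section II(B) can be sketched as a small date calculation. This is only an illustrative reading of the policy text, not an official interpretation: the function names are hypothetical, only the default rule (the longer of three years after final project closeout or five years after final reporting or publication) is modeled, and the numbered exceptions and any Section III retention notice are ignored.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # d is Feb 29 and the target year is not a leap year
        return d.replace(year=d.year + years, day=28)


def general_retention_end(closeout, published):
    """Earliest disposal date under the default rule only (assumed reading)."""
    after_closeout = add_years(closeout, 3)     # (i) three years after closeout
    after_publication = add_years(published, 5)  # (ii) five years after publication
    return max(after_closeout, after_publication)


if __name__ == "__main__":
    # Hypothetical project dates, chosen only to exercise the function.
    print(general_retention_end(date(2013, 6, 30), date(2012, 2, 29)))
```

For a sponsored, IRB-reviewed or clinical project the relevant exception in Section II(B) would replace this default, so the result should be treated as a lower bound to check against the actual agreement.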