server for large data files

Sylvie Le Guyader-2

server for large data files

Dear all

 

We often generate data files that are 50-100 GB. I suppose the only solution for us is to set up a server to save/back up our images.

 

Does anyone have any experience with very large amounts of data?

 

Med vänlig hälsning / Best regards

 

Sylvie

 

@@@@@@@@@@@@@@@@@@@@@@@@

Sylvie Le Guyader

Live Cell Imaging Unit

Dept of Biosciences and Nutrition

Karolinska Institutet

Sweden

office: +46 (0)8 608 9240

mobile: +46 (0) 73 733 5008

 

sabarinath radhakrishnan

Re: server for large data files

 
Hi Sylvie Le Guyader,
 
The best thing to do is to "RAID" the hard disks of your server.
 
RAID, an acronym for redundant array of independent disks or redundant array of inexpensive disks, is a technology that provides increased storage reliability through redundancy, combining multiple low-cost, less-reliable disk drive components into a logical unit where all drives in the array are interdependent. (-Wikipedia)
 
Although you have many options, I use and would strongly recommend RAID 1.
 
You just need 2 or 3 identical hard drives of equal capacity (e.g. 2 TB + 2 TB + 2 TB).

Here data is written identically to multiple disks (a "mirrored set"). Although many implementations create sets of 2 disks, sets may contain 3 or more disks. The array provides fault tolerance from disk errors or failures and continues to operate as long as at least one drive in the mirrored set is functioning. Increased read performance occurs when using a multi-threaded operating system that supports split seeks, along with a very small performance reduction when writing. Using RAID 1 with a separate controller for each disk is sometimes called duplexing. (-Wikipedia)

Hope this helps. Please check the following link http://en.wikipedia.org/wiki/RAID

Best,
 
Radha
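The capacity trade-off behind that recommendation can be put into numbers with a small sketch. This is purely illustrative (RAID 0 and 5 are included for comparison; the drive counts and sizes are assumptions, not from the post):

```python
# Illustrative sketch: usable capacity for common RAID levels,
# assuming n identical drives of the same size.

def usable_tb(level, n_drives, drive_tb):
    """Rough usable capacity in TB for a given RAID level."""
    if level == 0:                     # striping: no redundancy, full capacity
        return n_drives * drive_tb
    if level == 1:                     # mirroring: every drive holds the same data
        return drive_tb
    if level == 5:                     # striping with parity: one drive's worth lost
        return (n_drives - 1) * drive_tb
    raise ValueError(f"unhandled RAID level: {level}")

# Radha's example: three 2 TB drives
print(usable_tb(1, 3, 2))   # RAID 1: 2 TB usable, survives loss of two drives
print(usable_tb(5, 3, 2))   # RAID 5: 4 TB usable, survives loss of one drive
print(usable_tb(0, 3, 2))   # RAID 0: 6 TB usable, survives no drive loss
```

RAID 1 trades the most capacity for the most redundancy, which is why it suits backup duty.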

 

Neil Kad

Re: server for large data files

Hi Sylvie,

We've been looking into systems, this one is quite appealing:

http://www.drobo.com/

However, we have no experience of using it; does anyone else?

Cheers

Neil


Ramshesh, Venkat K

Re: server for large data files

Hi Sylvie,

We use this western digital product for our data backup.

http://www.wdc.com/en/products/products.asp?driveid=584

Easy to set up and use.

Best,
Venkat

Venkat Ramshesh, PhD
Bioengineer/Facility Manager
Cell and Molecular Imaging Core
Hollings Cancer Center and Center for Cell Death, Injury and Regeneration,
Medical University of South Carolina
QE302
280 Calhoun Street, MSC 140
Charleston, SC 29425

Ph: 843-792-3530
Fax: 843-792-8436
E-mail: [hidden email]

Craig Brideau

Re: server for large data files

In reply to this post by Neil Kad
Our lab uses an Apple Xserve server coupled to a Drobo Pro via FireWire.
Originally we had an Apple drive array, but it was for the older
parallel ATA drives so it was getting pretty obsolete.  The Drobo Pro
is Serial ATA (SATA).  One thing we noticed about it was that the file
access speeds with the Drobo are a bit slow compared to the old
system, but the old system was SAS which is pretty high-end.  The
Drobo Pro in comparison was about 1/8 of the price for about 50% of
the speed and 300% more storage.  Another nice thing about the Drobo
is you don't have to have all the drives be the same make and model,
or even size.  The onboard RAID-like algorithm will manage any drive
you give it, although smaller drives will decrease maximum storage
capacity.  We loaded ours with Seagate 1.5 TB 'LP' models, which are a
low-power version of their hard drives.  The key advantage of the LP
model is that it runs cooler than regular models.  When you have eight
HDs racked up next to each other, the reduced heat load is a big plus!

Craig


Mario-2

Re: server for large data files

In reply to this post by Sylvie Le Guyader-2
Sylvie,

When you say "data files that are 50-100 GB," do you mean single files or files in aggregate on a daily basis? What does this amount to per month or per year? 100 GB is easy to do, especially when collecting video-rate image sets (~20 min of uncompressed 16-bit/pixel HD, or ~200 x 128 full-frame image stacks).
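That video-rate figure can be sanity-checked with a short sketch. The frame size and frame rate here are assumed values for illustration, since the post doesn't specify them:

```python
# Rough data-volume estimate for uncompressed video-rate acquisition.
# Frame dimensions and frame rate are illustrative assumptions.

width, height = 1920, 1080       # HD frame
bytes_per_pixel = 2              # 16-bit/pixel
fps = 25
minutes = 20

total_bytes = width * height * bytes_per_pixel * fps * minutes * 60
print(f"{total_bytes / 1e9:.0f} GB")   # ~124 GB for a 20 min acquisition
```

So a single 20-minute uncompressed HD run lands in the 100 GB range, consistent with the sizes quoted in the thread.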

Craig's suggestion of using low-power drives sounds like a very good option. I wonder what you do after that, because with daily acquisition of that much data you will fill the drive(s) in short order.

Blu-ray storage discs can get you something like 50 GB for dual-layer, double-sided discs, which may be adequate. A couple of years ago Pioneer announced a 16-layer disc that could hold 400 GB per disc.

Sylvie, do you or anyone else know whether such a system using relatively non-volatile optical data encoding is available yet?




--
________________________________________________________________________________
Mario M. Moronne, Ph.D.

[hidden email]
[hidden email]
Craig Brideau

Re: server for large data files

In reply to this post by Sylvie Le Guyader-2
Yeah, 50-100GB is huge for a file.  A typical imaging session on our
spectral confocal (C1Si) is usually 2-6 GB, depending on spectral
density (up to 32 channels) and if large volumes are acquired.  I hope
you mean 50-100GB per imaging session, or that this is a typo and you
meant MB...  Handling 0.1 TB per file would be ugly.  Our Drobo is
good to 10 TB, but if what you are saying is correct that's only about
100 files/data sets for you.  You may want to have a look at tape
backup if you really are getting files that big.  I'd still recommend
a Drobo or other network attached storage just as a place to stash
files for processing.  Once you are done analyzing/manipulating a data
set move it to tape backup from the hard disk array.  Tapes are really
cheap, hold quite a bit, but are slow, so they are best for offline
archival.  The network attached storage (Drobo, etc) can't hold as
much as a bin full of tapes but are reasonably quick and would allow
you to work on the data.
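The back-of-envelope figure above is easy to verify (the 10 TB array size and 100 GB data-set size are taken from the thread; 1 TB = 1000 GB assumed):

```python
# How quickly a hard-disk array fills at the file sizes quoted in the thread.
array_gb = 10 * 1000       # 10 TB array capacity in GB
dataset_gb = 100           # one 100 GB data set per imaging session

sessions_until_full = array_gb // dataset_gb
print(sessions_until_full)   # 100 data sets before the array is full
```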

Links on general tape backup information:

http://findarticles.com/p/articles/mi_m0BRZ/is_8_20/ai_65513193/

http://www.zdnet.com/blog/ou/are-tape-backup-systems-obsolete/267

Craig

Cameron Nowell

Re: server for large data files

Hi Guys,
 
Thought I would chip in my thoughts as well. Storage capacity isn't the only thing to think of. Some people have already mentioned redundancy and/or backup, which is vital. The other thing to consider is data access speed. How many people will be accessing the data at the same time? A Drobo or similar business-class NAS array can handle a small group of users, but if you are going to have large numbers (let's say 10 or more) accessing the data at the same time, you will need to look at enterprise (read: expensive) solutions. You will also need to consider the speed and stability of all your network infrastructure. Having a shiny new NAS unit with gigabit-capable network speeds is no good if it is plugged into an older 100 Mb network or sits behind cheap, flaky switches and controllers.
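To put the network point in perspective, here is an idealized sketch of transfer times for a 100 GB data set at the two link speeds mentioned. These are best-case numbers; real throughput will be lower due to protocol overhead, disk speed, and contention:

```python
# Idealized transfer time for a 100 GB data set over different network links.
# Ignores protocol overhead and disk limits, so real transfers take longer.

size_bits = 100e9 * 8                       # 100 GB expressed in bits

for name, bits_per_sec in [("100 Mb/s", 100e6), ("1 Gb/s", 1e9)]:
    hours = size_bits / bits_per_sec / 3600
    print(f"{name}: {hours:.1f} h")         # ~2.2 h vs ~0.2 h
```

Even in the best case, a 100 Mb network turns every large data set into a multi-hour transfer, which is why the switch fabric matters as much as the NAS itself.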
 
Data storage for imaging is becoming a real issue. We are all getting systems capable of generating very large data sets many times a week, but we do not always have the IT support or funding to fully store and safely back up all the data.
 
Enterprise-level servers, while expensive(ish) to set up initially, can be fairly cheap to expand once installed (say $1,000-2,000 per TB).
 
 
Cheers
 
Cam
 
 
 
Cameron J. Nowell
Microscopy Manager
Central Resource for Advanced Microscopy
Ludwig Institute for Cancer Research
PO Box 2008
Royal Melbourne Hospital
Victoria, 3050
AUSTRALIA
 
Office: +61 3 9341 3155
Mobile: +61422882700
Fax: +61 3 9341 3104
 
http://www.ludwig.edu.au/branch/research/platform/microscopy.htm
 
