*****
To join, leave or search the confocal microscopy listserv, go to: http://lists.umn.edu/cgi-bin/wa?A0=confocalmicroscopy
Post images on http://www.imgur.com and include the link in your posting.
*****

Hello everyone,

We are currently exploring in-lab data storage options for the large amounts of data our light-sheet microscopes are producing, while simultaneously providing shared access from multiple PCs for processing and/or visualization. Given these needs, we have tentatively settled on purchasing a NAS server. I was wondering if anyone had suggestions or recommendations on the best companies/options available. In preliminary searching I have seen FreeNAS pop up several times, but I am quite a novice when it comes to data storage, so any input from the listserv would be greatly appreciated!

Thanks!
Adam
Michael Giacomelli

Hi Adam,

We use 12-bay Synology NAS devices, currently the RS2416RP+ model. Each rack-mount device can also support an expansion bay, for a total of 24 disks per device. The base device costs about $2,300, and when that fills up, an expansion bay is another $1,200. That works out to about $3,500 of server per 24 disks, or roughly $150 of server per disk.

Setup and management are easy: the software updates itself automatically and emails you if there is a problem or a disk failure. We run two-disk redundancy per 12-disk set. Performance is acceptable for what we do (dump a few hundred GB of images at a time into long-term storage), but you could do better if you needed to; the CPUs are quite slow.

We have bought them from SimplyNAS before. They are OK; they did mess up one order in the past and we had to exchange the hardware they sent, but we like that they will bundle the disks in with the unit, which is helpful for us for accounting reasons. There are many other vendors (Amazon, etc.). Backblaze publishes disk reliability data, so we typically buy what they say is reliable: https://www.backblaze.com/b2/hard-drive-test-data.html

You can do this more cheaply with PC hardware and a storage-optimized Linux distro, but we typically run the racks for ~7 years before retiring them, which means you need to be sure someone will be around who knows how to rebuild a RAID array when a disk fails, which happens from time to time. We were concerned about this, since students leave over time and we don't necessarily have someone available in any given year who knows older Linux systems.

Mike
Mel Symeonides
In reply to this post by Adam Glaser
Hi Adam,

(Warning: long email ahead.)

For our iSPIM we have the 8-bay Synology DS1817+ (about $900-$1,000) loaded with 8 x Western Digital Gold WD101KRYZ 10 TB hard drives (about $400 each), set up in SHR-2 mode (basically RAID-6, i.e. 2-disk redundancy), which leaves about 60 TB of usable space. I also installed an Intel X520-DA2 10GbE adapter, as the integrated Gigabit Ethernet connection would bottleneck the speed of the drive array; I found one for $150, but it was a refurb. You'll also need a 10GbE adapter for your PC and the appropriate SFP+ cable(s). Once this fills up, and we have exhausted all possibilities of compressing or discarding data, I think we will just buy another one rather than an expansion bay, since the expansion bay runs on the eSATA connector and is therefore limited in speed, so you get no speed gains from any RAID array that lives in the expansion bay.

Just be aware that, with regular 7,200 rpm SATA HDDs (i.e. not SSDs or 15,000 rpm SAS drives), the best real-world performance you will ever get with an 8-drive RAID-6 array over 10GbE is about 160-170 MB/s read (that's bytes, not bits) and 270 MB/s write. Teaming two 10GbE connections by link aggregation does nothing for real-world performance; it only provides failover redundancy in case one port or cable dies mid-transfer. For speed benefits you would need to split your packet stream across each link, which is not trivial. These speeds are way lower than what you get with a benchmarking tool like CrystalDiskMark - it always pays to do real-world work to figure out how something performs.

This kind of performance means you can pretty much forget any multi-user applications with this kind of NAS for large imaging datasets. You will not be able to provide access for anything more than someone copying their data off to their own system to process and analyze from a local drive/array, and you should only allow one user to be logged in at a time or they will all complain that it's taking ages. Also, if you'll be using your building's built-in Ethernet wiring, it's unlikely to be capable of 10GbE performance, so really you'll end up asking people to bring their portable drive down to you and copy the data right off, as that will be 2-3x faster (presuming, of course, that their portable drive isn't a bottleneck either). I also have a 6 TB RAID-0 SSD array in my workstation PC, so whenever I need to process/analyze data, I copy it off the NAS to the SSD RAID over the 10GbE connection, do all the work on the SSD RAID, and then copy it back to the NAS for storage.

If you want to actually serve large datasets directly to multiple simultaneous users, these Synology units will just not cut it. You'll need an enterprise solution, e.g. something from Dell, HP, or NetApp, which will cost somewhere in the tens of thousands USD for anything close to the 60 TB I mentioned. These run large arrays of either SSDs (super expensive) or at least fast SAS HDDs (only very expensive), on a multi-CPU Xeon platform with tons of RAM capable of handling multiple users. The Synology units have really weak CPUs and very little RAM, which can just barely handle a single user. You'll also need the necessary infrastructure in your building to allow remote users to connect at 10GbE speeds (and have them install 10GbE adapters in their computers), or it will totally defeat the point of having spent all that money on the enterprise server.

If you're really nuts you can try to build your own and could conceivably get performance close to what these enterprise units get for somewhat less money, but you really need to know what you're doing, and you will be entirely responsible for servicing it. You'll need a drive-failure plan, because server performance will tank for several days while a failed drive is replaced and the array rebuilds; your users will be very unhappy during the rebuild, and the larger the array and the slower the drives, the longer the rebuild will take.

Honestly, institutions that host big-data labs will sooner or later need to start taking responsibility for the data if they expect their labs to continue doing their work and bringing in grant money. It makes much more sense to have an institutional "data core facility" to maximize cost efficiency in purchasing these servers; buildings will need to be rewired with fiber, and personnel will need to be hired to maintain the servers and provide user support, all of which costs institutions $$$, so good luck with that.

No commercial interest in anything I mentioned.

Mel

--
Menelaos Symeonides
Post-Doctoral Associate, Thali Lab
Department of Microbiology and Molecular Genetics
University of Vermont
318 Stafford Hall
95 Carrigan Dr
Burlington, VT 05405
[hidden email]
Phone: 802-656-1161
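To put these numbers in context, here is a minimal back-of-the-envelope sketch using the figures quoted above (8 x 10 TB drives in RAID-6, ~170 MB/s read and ~270 MB/s write over 10GbE); the 1 TB dataset size is an illustrative assumption, not a measurement of any particular unit:

```cpp
#include <cstdio>

int main() {
    // RAID-6 keeps two disks' worth of parity, so usable space is (n - 2) * size.
    const int    disks     = 8;      // bays populated
    const double disk_tb   = 10.0;   // TB per disk (decimal TB)
    const double usable_tb = (disks - 2) * disk_tb;   // ~60 TB, as quoted above

    // Real-world throughput over 10GbE as reported above (HDD RAID-6, single user).
    const double read_mb_s  = 170.0;  // MB/s
    const double write_mb_s = 270.0;  // MB/s

    // Hypothetical 1 TB light-sheet dataset.
    const double dataset_mb = 1.0e6;  // 1 TB = 1,000,000 MB (decimal)

    std::printf("Usable capacity: %.0f TB\n", usable_tb);
    std::printf("Copy 1 TB off the NAS : %.1f hours\n", dataset_mb / read_mb_s / 3600.0);
    std::printf("Copy 1 TB onto the NAS: %.1f hours\n", dataset_mb / write_mb_s / 3600.0);
    return 0;
}
```

With those assumed rates, pulling a single 1 TB dataset off the array takes on the order of an hour and a half, which is the practical reason the advice above is to work from a local SSD array rather than directly off the NAS.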
Steffen Dietzel
In reply to this post by Adam Glaser
Hi Adam,

We are using a Synology DS3612xs: 12 bays, each filled with a 4 TByte disk. That was a huge size when we got it several years back (2011?). It runs in a RAID-6 configuration, which paid off when one of the disks failed - an annoying beep alerted me to get a new one. It felt rather comforting at the time that another disk could have failed without ruining the data.

The major advantage of this machine is that it does not need much care. Once installed, it just runs. The web interface is easy to use and the software updates itself automatically. Adding additional HDs is very easy (we started out with 5, now 12). I liked it so much that I got myself a 2-bay Synology at home - same user interface.

When we moved to a new building, I added a 10 Gbit glass-fiber network card. While in theory this should give us ~1 GByte/s, in reality it achieves ~500 MByte per second. Still not bad compared to the maximum of ~120 MByte/s on 1 Gbit Ethernet.

I have no experience with other NAS systems and (sadly) no financial interest in Synology.

Steffen

------------------------------------------------------------
Steffen Dietzel, PD Dr. rer. nat
Ludwig-Maximilians-Universität München
Biomedical Center (BMC)
Head of the Core Facility Bioimaging
Großhaderner Straße 9
D-82152 Planegg-Martinsried
Germany
http://www.bioimaging.bmc.med.uni-muenchen.de
Kate Luby-Phelps
In reply to this post by Adam Glaser
Other options besides Synology are Buffalo and QNAP; we have some of each. The Buffalo TeraStations are particularly reliable - we have had them for more than ten years and the disks only recently started failing. You can order Western Digital disks and replace them yourself, and the RAID array rebuilds automatically. We don't use our NAS for archiving, just for temporary storage and data serving.

You don't have to know Unix to administer either Buffalo or QNAP, since both have web GUIs for administration. Both Buffalo and QNAP offer 10 Gbps NAS models, and I have been very happy with the telephone support from both. As has already been mentioned, it is the read/write speed of the disks that sets the upper limit on data transfer rates, but for large files 10 Gbps will be noticeably faster if you have it available.
In reply to this post by Adam Glaser
Thanks everyone for your responses, this has been very helpful. It sounds like there are many things to consider, especially if we are hoping to interact with the data while it is on the server rather than just dump it purely for long-term storage. Two scenarios we are considering are:

1. Record a dataset (0.5-1.5 TB in size) on our acquisition workstation (4 x 2 TB RAID-0 SSDs, equipped with 10G fiber, 256 GB RAM) -> post-process on this same workstation -> transfer the final processed dataset (we use the HDF5-based Imaris format) to the NAS server for storage.

   - In this scenario, would it be feasible to open, visualize, and interact with this IMS file from 1 or 2 additional workstations (with similar specs to the acquisition workstation) over 10Gb while it sits on the NAS server? And in terms of RAM, is the important factor here how much RAM the workstation has, or how much the NAS server has?

2. Record a dataset (0.5-1.5 TB in size) on our acquisition workstation (4 x 2 TB RAID-0 SSDs, equipped with 10G fiber, 256 GB RAM) -> transfer the raw data to the NAS server -> post-process the data on the NAS server from a secondary workstation with similar specs to the acquisition workstation, over 10Gb.

   - Would this scheme be feasible, and again, is the limiting factor for our post-processing the RAM/CPU on the NAS server or the RAM/CPU on the secondary workstation?

Based on everyone's input, we are considering the Synology 16-bay RackStation RS4017xs+. The specs include 16 bays, an 8-core processor, 8 GB of RAM (expandable to 64 GB), two 10 Gb fiber ports, and two additional PCIe 3.0 slots. However, if we will not be able to interact with the data as planned while it is on the NAS server, we may just use AWS or on-campus options for simply storing and backing up the data. What initially attracted us to the idea of an in-lab NAS server was the combination of storage and accessibility for our datasets.

Apologies if these are confusing or naive questions!

Thanks,
Adam
Mel Symeonides
Hey Adam,

For option 1, if you're looking to do real-time 3D/4D visualization of the data straight off the NAS, it will be very slow (I bet it's already plenty slow even off your SSD RAID for very large volumes - imagine about 10% of that speed off the NAS). The bottleneck here will be the speed of the HDD array on the NAS, which is in turn related to the drive interface (for that Synology NAS it's SATA, the slowest option) as well as the CPU and RAM of the NAS. Your workstation is plenty powerful for any NAS and will never be the bottleneck, and the 10GbE interface will not be maxed out unless your NAS is loaded with SSDs. You would see better performance if the drives on the NAS were connected via SAS (which the NAS would specifically have to support), which runs full-duplex (simultaneous read-write streams) and spins at much higher speeds (10,000 or 15,000 rpm vs. 7,200 rpm for most SATA drives), but honestly at that point I would maybe just go with SSDs (we're talking huge cost increases either way). You will most likely see some performance gains in real-world operations with a cheap/slow SATA HDD array by increasing the RAM on the NAS (the more the better) and using those PCIe slots for NVMe SSD caching.

Option 2 might work, depending on how many datasets of that size you think you'll be getting every day/week and what processing is required (I'd guess deconvolution, deskewing, masking/thresholding?). You should expect that it will take a day or more to process a dataset that lives on the NAS, whereas it will take a few hours if it's on a local SSD RAID. However, that sounds like quite a high cost to set up a separate analysis workstation just to cope with the slowness of the NAS. Would it make more sense to have a combined acquisition/analysis workstation and run scripted analysis when you're not acquiring new data, e.g. overnight?

Mel
Michael Giacomelli
In reply to this post by Adam Glaser
In my experience, performance on the Synology machines scales quite linearly with the number of file accesses per second. If you work with a lot of large files, you will easily bottleneck on 1 Gbit Ethernet while the load on the server stays small, even with the cheaper models that use slow Intel Atom processors. If you work with lots of small files (e.g. 50 KB JPEGs, as when making Deep Zoom images), you end up completely bottlenecking the Synology.

With a higher-end Synology like the one you're looking at (probably 2-3x the CPU power), more disks, and relatively large image files, extrapolating, I think you'll get at least a few hundred MB/s even with multiple concurrent users. If you think you may need faster than that, it might be worth looking up some benchmarks. High-performance NAS is a big market, so I'm sure there is information out there.

Mike
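One simple way to see the per-file overhead Mike describes is to time writing the same total volume as many small files versus one large file onto the mounted NAS share. A minimal sketch, where the share path, file count, and sizes are placeholders to adapt (the target directory must already exist):

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <vector>

// Write 'count' files of 'bytes_each' bytes into 'dir' and return elapsed seconds.
static double write_files(const std::string& dir, int count, size_t bytes_each) {
    std::vector<char> buf(bytes_each, 0);               // dummy payload
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i) {
        std::string path = dir + "/bench_" + std::to_string(i) + ".bin";
        if (FILE* f = std::fopen(path.c_str(), "wb")) {
            std::fwrite(buf.data(), 1, buf.size(), f);
            std::fclose(f);
        }
    }
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

int main() {
    const std::string share = "/mnt/nas/bench";          // placeholder: mounted NAS share
    const size_t total = 500ull * 1024 * 1024;           // 500 MB total in both cases

    double t_small = write_files(share, 10000, total / 10000);  // 10,000 x ~50 KB files
    double t_large = write_files(share, 1, total);               // one 500 MB file

    std::printf("small files: %.1f s (%.1f MB/s)\n", t_small, total / 1e6 / t_small);
    std::printf("large file : %.1f s (%.1f MB/s)\n", t_large, total / 1e6 / t_large);
    return 0;
}
```

On a NAS that is limited by file accesses per second rather than raw bandwidth, the small-file run will typically come out far slower than the single large file, even though the total volume is identical.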
In reply to this post by Adam Glaser
Thanks Mike and everyone else for the continued advice. It sounds like although we would not get the maximum theoretical transfer speeds to a NAS server, with a larger array and more server RAM and CPU we should be able to get relatively decent speeds as well as many TB of storage.

I have one last question for everyone: has anyone tested streaming data from sCMOS cameras to a PCIe SSD?

I understand that the norm is streaming to a RAID-0 SATA SSD array, like we have now in our workstation. The advantage of the SATA SSDs is that you get a proportionally larger drive size and speed, e.g. 8 TB and >1 GB/s for our setup with 4 x 2 TB SSDs. The latest PCIe SSDs (like the Samsung 960 Evo/Pro) reach equivalent speeds but without the ability to proportionally increase the drive size. But since our raw dataset sizes are generally around 1 TB, a single 2 TB PCIe SSD would be able to capture an entire dataset for us. Then, after acquisition, the raw data could be dumped from the PCIe SSD to an HDD RAID array in the SATA slots. This would provide many TB of storage, albeit at reduced speeds, and it would allow a single workstation to both acquire data and have increased storage (although not as large as a NAS server with 12 or 16 bays).

Thanks!
Adam
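For sizing a single NVMe acquisition drive against a camera stream, a rough calculation of how long recording can run before the drive fills is useful. A minimal sketch, assuming a 4 MPixel, 16-bit sCMOS running at 100 fps; the sustained-write figure is a placeholder to replace with the actual spec of whatever drive is used:

```cpp
#include <cstdio>

int main() {
    // Assumed camera stream: 2048 x 2048 pixels, 16 bit, 100 frames/s.
    const double frame_mb   = 2048.0 * 2048.0 * 2.0 / 1e6;   // ~8.4 MB per frame
    const double fps        = 100.0;
    const double stream_mbs = frame_mb * fps;                 // ~840 MB/s

    const double drive_tb      = 2.0;      // single NVMe acquisition drive
    const double dataset_tb    = 1.0;      // typical raw dataset size quoted above
    const double sustained_mbs = 1500.0;   // placeholder sustained write of the drive

    std::printf("Camera stream: %.0f MB/s\n", stream_mbs);
    std::printf("Drive keeps up with the stream: %s\n",
                sustained_mbs >= stream_mbs ? "yes" : "no");
    std::printf("Minutes of recording until the %.0f TB drive is full: %.0f\n",
                drive_tb, drive_tb * 1e6 / stream_mbs / 60.0);
    std::printf("Datasets of %.1f TB that fit before offloading: %.1f\n",
                dataset_tb, drive_tb / dataset_tb);
    return 0;
}
```

Under these assumptions a 2 TB drive holds roughly 40 minutes of full-speed streaming, i.e. about two of the ~1 TB raw datasets, before data has to be offloaded to the HDD array.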
Michael Giacomelli
Hi Adam,

I haven't done sCMOS, but we do stream from Alazar PCIe A/D boards (2-3 GB/s) when doing imaging. Getting close to the benchmark rates is challenging with NVMe. You need large transfer blocks, and ideally several running in parallel. The included Alazar software can easily saturate a 960 Evo (it'll do >6 GB/s to RAM) or a RAID-0 of 4 x 840 Pros, but I never figured out what they were doing to make it work so well. I tested a few different Win32 APIs, but ended up just doing fopen/fwrite and then spinning off a lot of parallel threads for writing. That more or less saturated a single 960 Evo's sustained performance (see below), but I'm not sure how well it would scale to faster disks.

By the way, I'd recommend against getting a 960 Evo. They have very fast burst writes to the device's SLC cache, but sustained performance once the SLC is exhausted and you fall back to TLC memory is worse. The 960 Pro is MLC and will sustain higher data rates if you want to completely fill the disk in one go. You could also get several cheaper NVMe devices (Evos or other TLC) and software-RAID them (which works surprisingly well) if you don't want to pay for the Pro or need more sustained write than a single device can handle. Some newer boards support splitting the PCIe x16 slot into 4x M.2, in which case you can fit a lot of NVMe devices in one system.

Mike
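A minimal sketch of the multi-threaded fopen/fwrite approach Mike describes - not his actual code - where each worker thread streams its own chunk file so several large write requests are in flight on the NVMe device at once. The output path, block size, and thread count are placeholders to tune for the drive:

```cpp
#include <cstdio>
#include <string>
#include <thread>
#include <vector>

// Each worker streams 'blocks' blocks of 'block_bytes' to its own file.
static void writer(const std::string& path, size_t block_bytes, int blocks) {
    std::vector<char> buf(block_bytes, 0);        // stand-in for camera/ADC frames
    if (FILE* f = std::fopen(path.c_str(), "wb")) {
        for (int i = 0; i < blocks; ++i)
            std::fwrite(buf.data(), 1, buf.size(), f);
        std::fclose(f);
    }
}

int main() {
    const std::string out_dir = "D:/acquisition";   // placeholder: NVMe volume
    const int    n_threads    = 8;                  // parallel write streams
    const size_t block_bytes  = 16ull << 20;        // 16 MB transfer blocks
    const int    blocks_each  = 64;                 // ~1 GB per thread

    std::vector<std::thread> pool;
    for (int t = 0; t < n_threads; ++t)
        pool.emplace_back(writer,
                          out_dir + "/chunk_" + std::to_string(t) + ".bin",
                          block_bytes, blocks_each);
    for (auto& th : pool)
        th.join();
    return 0;
}
```

In a real acquisition loop the dummy buffer would be replaced by frames handed off from the camera/digitizer callback, but the structure - large blocks, multiple independent write streams - is the part that matters for approaching the drive's sustained rate.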
In reply to this post by Adam Glaser
The bleeding edge right now is solid-state M.2 and U.2 drives that use the PCIe bus to communicate with the mainboard. You can buy PCIe cards that host these drives, and their access speed is greater than any of the available SATA formats. Each drive takes 4 PCIe lanes, though, so you need to make sure you have the available lanes on your motherboard. The newer Ryzen chips tend to support more PCIe lanes on the motherboard.

Craig
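As a rough lane-budget check, a small sketch assuming PCIe 3.0 (about 985 MB/s of usable bandwidth per lane after encoding overhead) and four lanes per NVMe drive; the lane counts are illustrative, not those of any specific board:

```cpp
#include <cstdio>

int main() {
    // PCIe 3.0: 8 GT/s per lane with 128b/130b encoding ~= 985 MB/s usable.
    const double mb_per_lane   = 985.0;
    const int    lanes_per_ssd = 4;       // typical M.2/U.2 NVMe drive

    const int cpu_lanes = 16;             // example: one x16 slot bifurcated to 4x M.2
    const int ssd_count = cpu_lanes / lanes_per_ssd;

    std::printf("Drives supported by a bifurcated x16 slot: %d\n", ssd_count);
    std::printf("Per-drive link bandwidth: ~%.1f GB/s\n",
                lanes_per_ssd * mb_per_lane / 1000.0);
    std::printf("Aggregate link bandwidth: ~%.1f GB/s\n",
                ssd_count * lanes_per_ssd * mb_per_lane / 1000.0);
    return 0;
}
```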
In reply to this post by Adam Glaser
*** Please note that I have commercial interests in what I write here, in that I am working with the mentioned solution provider on a number of projects. ***

Hello Adam, hello list,

I hope I can add a bit to the earlier posts, since all of them have already raised the relevant topics. I look at this with the following background: I have been working on a number of light-sheet projects, and an appropriate data solution has always been a key part of the successful use of the setups. My experience includes commercial systems (e.g., Zeiss, Luxendo, ASI/3i) and home-built light-sheet systems.

As a reference for data rates, I always keep in mind that one sCMOS camera (4 MPixel, 16-bit acquisition at full speed, 100 fps) delivers approximately 800 MByte/sec. Hence, most light-sheet systems easily produce 800 MByte/sec or even 1.6 GByte/sec over extended periods of time.

My major learnings, also with respect to your two scenarios, were:

- Copying the data from a primary acquisition volume to a second volume for processing or safer/longer storage always becomes a bottleneck very soon. I'd recommend going for a solution that delivers the data from the microscope straight to the storage where it is supposed to stay, and where it is safe and at the same time accessible for processing. This requires fast and safe volumes and fast networks. Typically, larger RAID systems of SAS HDDs (e.g., 15+ SAS enterprise HDDs) form the basis, and suitable RAID controllers and multiple dedicated 10 Gbit network connections are needed. If well configured, you can save >1 GByte/sec and at the same time keep capacity for simultaneous processing I/O. The RAID controller is a key component for fast I/O! Many of them fail to support the data rates that the disk arrays per se would allow.

- Pulling data over the network into a processing unit is slow, especially when multiple users are supposed to work simultaneously. A 10 Gbit label on the cable/adapter will not necessarily save you. Maybe a direct 10 Gbit connection is OK, but the built-in network of a research institute will typically not give you the performance you need. This is even more important when working with I/O-intensive processing like 3D visualization and analysis or deconvolution. Your processing unit with CPU/RAM/GPU capacity should be "close" to the data, directly connected over a bus system like SAS; 10 Gbit Ethernet is only second choice here.

- Together with the processing resources (RAM, CPU or GPU), the limiting factors are usually network bandwidth and I/O to/from the data volume.

- Scalability is essential. If you start with 30 terabytes today, you should already have a plan for how to expand that capacity without re-formatting huge volumes. The option of expanding RAM and the number of GPUs is also worth considering.

Therefore, in my opinion, a NAS is often too slow if more than just storage is required. As mentioned in earlier posts, good commercial ("enterprise") solutions are available. Of course, you can build a performing solution yourself, if you know what you are doing and have the people and time to do it. If you buy, make sure the provider knows your applications. If they usually supply big-data servers for relational databases but don't know what a microscope is, you might as well configure the system yourself - I have gone through this the hard way.

The best solution for image data that I am aware of has been developed and is sold by a joint team of microscopy experts, IT hardware experts, and network specialists: the HIVE platform by the company ACQUIFER (https://www.acquifer.de/data-solutions/, [hidden email]). ACQUIFER is a solution provider, and they go with you through all steps of assessing your needs, talking to your local IT people, shipping, installing, supporting your applications, and servicing the platform. Importantly, they work together with microscope and camera manufacturers, run tests, and are keen to support your applications. Therefore, the HIVE is more cost-intensive than a simple NAS or a basic enterprise solution, but you get a most competent partner who also stays with you after the installation is done.

The ACQUIFER HIVE is a professional solution, combining fast and safe RAID storage from 50 TB to petabytes with high-end processing units, including solutions for GPU-intensive processing. It is modular, allowing it to be configured to your needs (CPU/RAM/GPU and storage volume), whether dedicated to a single high-speed microscope or set up as a central data solution for a core facility, and it can easily grow later. It also comes with a dedicated network module including router, firewall, uninterruptible power supply, and multi-10 Gbit network switches for a dedicated, direct connection from microscope to storage. It runs Windows Server (multi-user access and easy remote operation!), and new releases of common image-processing packages are regularly tested to ensure they work. High data rates and large datasets from all modern microscopy modalities, such as light-sheet, fast confocal, and super-resolution microscopes, and also cryo-electron microscopes, can be acquired and processed in one multi-user platform. In summary, it is a small computing center that is easy to manage and can go under the desk next to the microscope, right where it is needed.

If you want to discuss with me offline, or if you want me to forward you to the ACQUIFER team, feel free to email me.

Best regards,
Olaf

____________________________
Dr. Olaf Selchow
Microscopy & BioImaging Consulting
[hidden email]
+49 172 3286313
skype: olaf.selchow
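To make the scalability point above concrete, a quick estimate of how much storage a sustained stream consumes over a long acquisition, using the data rates Olaf quotes; the 12-hour session length is an illustrative assumption:

```cpp
#include <cstdio>

int main() {
    // Sustained acquisition rates quoted above for light-sheet systems.
    const double rates_mbs[] = {800.0, 1600.0};   // one or two sCMOS cameras at full speed
    const double hours       = 12.0;              // example: overnight time-lapse

    for (double r : rates_mbs) {
        double tb = r * 3600.0 * hours / 1e6;     // MB accumulated -> TB (decimal)
        std::printf("%.0f MB/s for %.0f h -> %.1f TB\n", r, hours, tb);
    }
    return 0;
}
```

A single overnight run at 800 MByte/sec already amounts to roughly 35 TB of raw data, which is why a 30 TB starting volume needs an expansion plan from day one.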