
My Data Warehouse Dream Machine: Building the Ideal Data Warehouse
Michael R. Ault, Oracle Guru, Texas Memory Systems, Inc.

Introduction
Before we can discuss what is needed in a data warehouse (DWH) system, we need to pin down exactly what a data warehouse is and what it is not. Many companies accumulate a large amount of data, and when they reach a certain threshold of size or complexity they feel they have a data warehouse. In the pure sense of the term, however, they have a large database, not a data warehouse; and a large non-conforming database masquerading as a data warehouse will probably have more severe performance issues than a database designed from the start to be a data warehouse. Most experts agree that a data warehouse has to have a specific structure, though the exact structure may differ. For example, the star schema is a classic data warehouse structure. A star schema uses a central “fact” table surrounded by “dimension” tables. The central fact table contains the key values for each of the dimension tables corresponding to a particular combination of dimensions. For example, the sales of pink sweaters in Pittsburgh, PA, on Memorial Day in 2007 are the intersection of the STORES, ITEMS, SUPPLIERS, and DATE dimensions with the central SALES fact table, as in Figure 1.

Figure 1: Example Star Schema
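To make the star-join idea concrete, here is a minimal sketch in Python using SQLite. The paper assumes Oracle; SQLite is used here only so the example is self-contained, and the sample rows and any column names beyond the STORES, ITEMS, SUPPLIERS, DATE, and SALES tables named above are illustrative assumptions.

```python
# A minimal, self-contained sketch of the star schema in Figure 1.
# Table names follow the paper; column names and sample rows are hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

cur.executescript("""
-- Dimension tables: descriptive attributes keyed by surrogate keys.
CREATE TABLE stores    (store_id INTEGER PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE items     (item_id INTEGER PRIMARY KEY, description TEXT, color TEXT);
CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE date_dim  (date_id INTEGER PRIMARY KEY, calendar_date TEXT, holiday TEXT);

-- Central fact table: one row per intersection of the dimension keys.
CREATE TABLE sales (
    store_id INTEGER, item_id INTEGER, supplier_id INTEGER, date_id INTEGER,
    quantity INTEGER, amount REAL
);
""")

cur.executemany("INSERT INTO stores VALUES (?,?,?)", [(1, "Pittsburgh", "PA")])
cur.executemany("INSERT INTO items VALUES (?,?,?)", [(10, "sweater", "pink")])
cur.executemany("INSERT INTO suppliers VALUES (?,?)", [(100, "Acme Knits")])  # hypothetical supplier
cur.executemany("INSERT INTO date_dim VALUES (?,?,?)",
                [(20070528, "2007-05-28", "Memorial Day")])
cur.executemany("INSERT INTO sales VALUES (?,?,?,?,?,?)",
                [(1, 10, 100, 20070528, 42, 1260.00)])

# A star join: filter on the dimensions, then aggregate the fact rows that sit
# at the intersection of the qualifying dimension keys.
cur.execute("""
SELECT SUM(s.quantity), SUM(s.amount)
FROM sales s
JOIN stores   st ON st.store_id = s.store_id
JOIN items    it ON it.item_id  = s.item_id
JOIN date_dim d  ON d.date_id   = s.date_id
WHERE st.city = 'Pittsburgh' AND it.color = 'pink' AND d.holiday = 'Memorial Day'
""")
print(cur.fetchone())   # -> (42, 1260.0)
```

The dimensions are filtered first, and the fact rows at the intersection of the qualifying keys are then aggregated, which is the dimension-to-fact access path described in the next section.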

Generally speaking, a data warehouse will use either a Star or a Snowflake design (essentially a collection of related Star schemas). Of course, we need to understand the IO and processing characteristics of the typical DWH system before we describe an ideal system. We must also understand that the ideal system will depend on the projected size of the DWH. For this paper we will assume a size of 300 gigabytes, which is actually small compared with many companies’ terabyte or multi-terabyte data warehouses.

IO Characteristics of a Data Warehouse
The usual form of access for a data warehouse in Oracle will be through a bitmap join across the keys stored in the fact table, followed by retrieval of both the related data from the dimension tables and the data at the intersection of the keys in the fact table. This outside-to-inside (dimension to fact) access path is called a Star Join. Many times the access to a data warehouse will be through index scans followed by table scans, involving large amounts of scanning and generating large IO profiles. Access will be to several large objects at once: the indexes, the dimensions, and the fact table(s). This access to many items at once leads to large numbers of input and output operations per second (IOPS). For example, in a 300 gigabyte TPC-H test (TPC-H is a DSS/DWH benchmark) the IO rate can exceed 200,000 IOPS. A typical disk drive (15K RPM, 32-148 gigabytes, Fibre Channel) will deliver a peak of around 200 random IOPS. It is not unusual to see a ratio of provisioned storage capacity to actual database size of 30-40:1 to ensure that the number of IOPS required to satisfy the performance requirements of a large DWH can be reached. The IO profile for a 300GB system with the temporary tablespace on solid state disks (SSD) is shown in Figure 2.
Figure 2: IOPS for a 300GB TPC-H on Hard Drives (log-scale IOPS vs. elapsed seconds; series: permanent IO, temporary IO, total IO)

The same 300GB TPC-H with all tablespaces (data, index, and temporary) on SSD is shown in Figure 3.
Figure 3: IOPS from a 300GB TPC-H on SSD (log-scale IOPS vs. elapsed seconds)

Note that the average IOPS in Figure 2 hover around 1,000-2,000 IOPS, with peak loads (mostly to the SSD temporary file) of close to 10,000 IOPS, while the SSD-based test hovers around 10,000 IOPS with peak loads nearing 100,000 or more IOPS. The numbers in the charts in Figures 2 and 3 were derived from the GV$FILESTAT and GV$TEMPSTAT views in a 4-node RAC cluster. It was assumed that the raw IOPS recorded by Oracle underwent a 16-fold reduction due to IO grouping by the HBA and IO interfaces. The hard drive arrays consisted of two 14-disk sets of 15K RPM, 144 GB drives in two RAID5 arrays. The SSD subsystem consisted of a single 1 terabyte RamSan-500, a single 128 GB RamSan-400, and a single 128 GB RamSan-320. To get 100,000 IOPS, a modern disk-drive-based system may require up to 500 disk drives, not allowing for mirroring (RAID). To put this in perspective: to properly spread the IOPS in a 300 GB data warehouse (actually close to 600 GB once indexes are added), you will require 300*40, or 12,000 GB (12 terabytes), of storage to meet the IO requirements. At 200 IOPS per disk, that maps to 500 disk drives for 100,000 IOPS if no caching or other acceleration technologies are utilized. In actual tests, EMC reached 100,000 IOPS with 495 disks (3 CX30 cabinets' worth) in a RAID1 configuration with no mirroring (http://blogs.vmware.com/performance/2008/05/100000-io-opera.html). The EMC results are shown in Figure 4. Assuming they get linear results by adding disks (and HBAs, cabinets, and controllers), they should be able to get close to 200,000 IOPS with 782 disks, which is reasonably close to our 1,000-disk estimate (double the 500 drives needed for 100,000 IOPS). However, their latency will be close to 15 or more milliseconds if the trend shown in the graph in Figure 4 continues.

Figure 4: EMC IOPS and Latency (From: http://blogs.vmware.com/performance/2008/05/100000-io-opera.html)

Since most database systems such as Oracle use a standard IO profile, increasing the amount of data in the warehouse means we must increase the available IOPS and bandwidth as the database grows. Most companies project that their data warehouse will double in size within 3 years or less. Of course, the available IOPS will determine the response time for your queries. If you can afford high response times, then your IOPS can be lower; conversely, if you need low response times, then your IOPS must be higher. In today’s business environment, the sooner you can get answers to typical DWH queries, the sooner you can make strategic business decisions. This leads to the corollary that the DWH should have the highest possible IOPS. Figure 5 shows a comparison between various SAN systems for IO latency.
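To put the earlier drive-count estimate in concrete terms, here is that arithmetic as a short Python sketch. The 40:1 capacity ratio and 200-IOPS-per-drive figure are the paper's numbers; everything else is simple arithmetic rather than measured or vendor data.

```python
# Back-of-the-envelope disk sizing for the 300 GB warehouse discussed above.
# Inputs are the figures quoted in the text; nothing here is measured data.
DB_SIZE_GB     = 300        # raw data size, before indexes
CAPACITY_RATIO = 40         # provisioned storage : database size (30-40:1 rule of thumb)
IOPS_PER_DISK  = 200        # random IOPS for one 15K RPM Fibre Channel drive
TARGET_IOPS    = 100_000

provisioned_gb = DB_SIZE_GB * CAPACITY_RATIO          # 12,000 GB (12 TB)
disks_for_iops = TARGET_IOPS // IOPS_PER_DISK         # 500 drives

print(f"Provisioned storage needed: {provisioned_gb:,} GB")
print(f"Drives needed for {TARGET_IOPS:,} IOPS: {disks_for_iops}")
```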

Figure 5: IOPS Comparison for Various SANs (Source: www.storageperformance.org)

As you can see, the IOPS and latency numbers from an SSD-based SAN are better than those from more expensive hard-disk-based SANs. Even compared with disk form-factor SSDs, an SSD system designed from the ground up for performance is still superior. As shown in Figure 6, the latency of the new EMC Flash-based drives still cannot compete with SSDs built from the start to perform.

Figure 6: EMC SSD response time: 1-2 ms; EMC HDD response time: 4-8 ms (Source: "EMC Tech Talk: Enterprise Flash Drives", Barry A. Burke, Chief Strategy Officer, Symmetrix Product Group, EMC Storage Division, June 25, 2008)

Processing Characteristics of a Data Warehouse System
DWH systems usually provide summarizations of data: total sales, total uses, or the number of people doing X at a specific point in time and place. This use of aggregation generally leads to requirements for large amounts of sort memory and temporary tablespace. Thus the capability to rapidly sort, summarize, and characterize data is a key component of a data warehouse system. In TPC-H tests we consistently see large numbers of CPUs, large core memories, and the parallel query and partitioning options used to allow DWH systems to process the large amounts of data. Most TPC-H tests are run using clustered systems of one type or another. For a 300 GB TPC-H we usually see a minimum of 32 CPUs spread evenly amongst several servers. Technologies such as blade servers offer great flexibility but also tie us to a specific vendor and blade type. In addition, expansion will eventually be limited by the blade enclosure itself, because of the underlying bus structure of the blade cabinet backplane.
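As a rough illustration of how that sort and parallel-query memory pressure translates into sizing, here is a small Python sketch. The node count, per-node RAM, and the 40/40/20 split between buffer cache, sort/hash work areas, and OS headroom are purely illustrative assumptions, not Oracle recommendations or figures taken from the TPC-H results.

```python
# Illustrative per-node memory budget for a parallel DWH workload.
# Assumed layout: 32 CPUs spread across 4 nodes with 32 GB of RAM each.
NODES           = 4
CORES_PER_NODE  = 8
RAM_GB_PER_NODE = 32

sga_gb = RAM_GB_PER_NODE * 0.40     # buffer cache and shared structures (assumed split)
pga_gb = RAM_GB_PER_NODE * 0.40     # sort, hash-join, and bitmap-merge work areas
os_gb  = RAM_GB_PER_NODE - sga_gb - pga_gb

# With one parallel query slave per core, each slave gets roughly this much
# work-area memory before sorts spill to the temporary tablespace:
per_slave_mb = pga_gb * 1024 / CORES_PER_NODE

print(f"Per node: SGA ~{sga_gb:.0f} GB, work areas ~{pga_gb:.0f} GB, OS ~{os_gb:.0f} GB")
print(f"Approximate work area per parallel slave: {per_slave_mb:.0f} MB")
```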

What Have We Found So Far?
So far we have defined the following general characteristics for an ideal DWH system:

1. Large data storage capacity
2. Able to sustain large numbers of IOPS
3. Able to support high degrees of parallel processing (supports large numbers of CPUs)
4. Large core memory for each server/CPU
5. Easy to increase data size and IOPS capacity
6. Easy to increase processing capability

The above requirements call for the following in an Oracle environment:

1. Real Application Clusters
2. Partitioning
3. Parallel Query

Given the Oracle requirements, the system server requirements would be:

1. Servers with multiple high-speed CPUs
2. Multiple servers
3. A high-speed interconnect, such as InfiniBand, between the servers
4. Multiple high-bandwidth connections to the IO subsystem

Given the IO subsystem requirements, the IO subsystem should:

1. Be easily expandable
2. Provide high numbers of low-latency IOPS
3. Provide high bandwidth

Notice we haven't talked about network requirements for a DWH system. Generally speaking, DWH systems will have a small number of users in comparison to an online transaction processing system, so a single 1 Gigabit Ethernet connection is generally sufficient for user access.

Software
Of course, since we are working with Oracle it is assumed that we will stay with Oracle. But for long-term planning, the idea that we might one day move away from Oracle should be entertained. Therefore our system should support multiple solutions, should the need arise. In the days when processors seemed to be increasing in speed on a daily basis and we were jumping from 8- to 16- to 32- to 64-bit processing, the idea of keeping a system much beyond three years was virtually unheard of, unless it was a large SMP machine or a mainframe. While processors are still increasing in speed, we aren't seeing the huge leaps we used to. Now we are seeing the core wars: each manufacturer seems to be placing more and more cores in a single chip footprint; note the dual- and quad-core chips already available. Of course, it seems that as the number of cores on a single chip increases, the speed of the individual cores actually decreases. For example, a single-core processor chip can run at 4 GHz while a dual-core may only run at 2 GHz per core. However, software will usually take advantage of whatever CPUs are offered, so as far as the CPUs and their related servers are concerned, simply choosing the best high-speed processors on a supported platform lets us run most any software our operating system will support. Disk-based systems used to be fairly generic and only required reformatting or erasure of existing files to be reused for a new database system. Now we are seeing the introduction of database-specific hardware, such as the Exadata cell from Oracle, that requires Oracle parallel query software at the cell level in order to operate. Needless to say, technology that locks you into a specific software vendor may be good for the software vendor, but it may not be best in the long run for the company that buys it.

Let’s Build the Ideal System
Let’s start with the servers for the ideal system.

Data Warehouse Servers
We want the system to be flexible as far as upgrades go, and while blade systems have a lot to offer, they lock you into a specific blade cabinet and blades, so we will use individual rack-mount servers instead. The use of individual rack-mount servers gives us the flexibility to change our servers without having to re-purchase support hardware such as the blade cabinet and other support blades. The server I suggest is the Dell PowerEdge R905. The R905 supports four quad-core Opteron™ 8393 SE, 3.1 GHz processors, arguably the fastest quad-core chips and all-around best processors available for the money. The complete specifications are shown in Appendix A for the suggested quad-socket, 16-core configuration, which includes a dual 1 Gb NIC and two dual-channel 4 Gb Fibre Channel connections. Also included is a 10 Gb NIC for the Real Application Clusters cross-connect. Since we will want the capability to parallelize our queries, we will also want more than 16 CPUs, so for our ideal configuration I suggest starting at 2 servers, giving us 32 3.1 GHz processor cores. To maximize our SGA sizes for the Oracle instances, I suggest the 32 gigabyte memory option with the fastest memory available. At currently available pricing, this 2-server configuration will cost roughly $36.5K with 3 years of maintenance.

IO Subsystem
Call me a radical, but rather than go the safe route and talk about a mixed environment of disks and solid state storage, I am going to go out on a limb and propose that all active storage be on a mix of Flash and DDR memory devices. We will have disks, but they will be in the backup system. Figure 7 shows the speed/latency pyramid with Tier Zero at the peak.

Figure 7: The Speed/Latency Pyramid

First, let's look at what needs to be the fastest, lowest latency storage: Tier Zero.

Tier Zero
As the Tier Zero device I propose a RAM-based solid state system such as the RamSan-440 to provide storage for temporary tablespaces, redo logs, and undo segments, as well as any indexes that will fit after the write-dependent objects have been placed. The prime index candidates for the Tier Zero area would be the bitmap indexes used to support the star or snowflake schema for fast star-join processing. I propose a set of four RamSan-440s, with 512 gigabytes of available storage each, in a mirrored configuration to provide a 1 terabyte Tier Zero storage area. At current costs this would run $720K. The RamSan-440 provides up to 600,000 IOPS at 0.015 millisecond latency. Now let's move on to the next fastest storage, Tier 1.

Tier 1
Tier 1 storage will hold the data and index areas of the data warehouse. Tier 1 of this ideal system will be placed on Flash memory. A Flash system such as the RamSan-620 provides up to 5 terabytes of Flash with a RAM-based cache in front of it to enhance write performance. We would utilize two 2 TB RamSan-620s in a mirrored configuration. With our 300 gigabytes of data and roughly 250 gigabytes of indexes, this provides 2 terabytes of mirrored space to allow for growth and reliability. At current costs this would be $202K (2 TB option with 3 years of maintenance and 1 extra dual-port FC card).

The RamSan-620 provides 250,000 IOPS with a worst-case 0.25 millisecond read latency. Assuming we could add enough HBAs, we can achieve 2.9 million low-latency IOPS from our Tier Zero and Tier 1 systems using the above configuration.
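A quick sketch of the tier arithmetic, using the unit counts and vendor figures quoted above (the helper function is just illustrative bookkeeping, not a sizing tool):

```python
# Usable mirrored capacity and aggregate IOPS for the proposed Tier Zero and
# Tier 1 devices, using the unit counts and vendor figures quoted in the text.
def tier(units, capacity_gb_each, iops_each, mirrored=True):
    raw_gb = units * capacity_gb_each
    usable_gb = raw_gb // 2 if mirrored else raw_gb   # mirroring halves usable capacity
    return usable_gb, units * iops_each

tier0_gb, tier0_iops = tier(units=4, capacity_gb_each=512,  iops_each=600_000)  # RamSan-440
tier1_gb, tier1_iops = tier(units=2, capacity_gb_each=2048, iops_each=250_000)  # RamSan-620

print(f"Tier Zero: {tier0_gb:,} GB usable, {tier0_iops:,} IOPS")   # ~1 TB, 2,400,000 IOPS
print(f"Tier 1:    {tier1_gb:,} GB usable, {tier1_iops:,} IOPS")   # ~2 TB,   500,000 IOPS
print(f"Combined IOPS: {tier0_iops + tier1_iops:,}")               # 2,900,000
```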

Tier 2
Tier 2 storage would be our backup area. I suggest using compression and de-duplication hardware/software to maximize the amount of backup capability while minimizing the amount of storage needed. The manufacturer that comes to the top of the pile for this type of backup system is DataDomain. The DD120 system would fulfill our current needs for backup on this project system. The list price for the DataDomain DD120 appliance is $12.5K. All of this tier talk is fine, but how are we going to hook it all together?

Switches
As a final component we need to add some Fibre Channel switches, probably four 16-port 4 Gb switches, to give us redundancy in our pathways. A QLogic SANbox 5600Q provides sixteen 4 Gb ports. Four 5600Qs would give us the needed redundancy and provide the needed number of ports at a cost of around $3,275.00 each, for a total outlay on SAN switches of $13.1K. The cost of the XG700 10 gigabit Ethernet 16-port switch from Fujitsu is about $6.5K, so our total outlay for all switches is $19.6K.

Final Cost of the Dream Machine
Let's run up the total bill for the data warehouse dream machine:

Servers:        $36,484.00
RamSan-440:    $720,000.00
RamSan-620:    $202,000.00
DataDomain:     $12,500.00
Switches:       $19,600.00
Misc.:           $1,500.00 (cables, rack, etc.)
Total:         $992,084.00
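As a quick check on the total, a short Python sketch summing the line items above (all figures are the prices quoted in this paper):

```python
# Bill of materials for the "dream machine", totaled from the quoted prices.
costs = {
    "Servers (2x Dell R905)":       36_484,
    "RamSan-440 (Tier Zero)":      720_000,
    "RamSan-620 (Tier 1)":         202_000,
    "DataDomain DD120 (Tier 2)":    12_500,
    "Switches (FC + 10GbE)":        19_600,
    "Misc. (cables, rack, etc.)":    1_500,
}
total = sum(costs.values())
print(f"Hardware total: ${total:,}")   # -> Hardware total: $992,084
```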

So for $992K we could get a data warehouse system capable of over 2,000,000 IOPS, with 32 3.1 GHz CPUs, a combined memory capacity of 64 gigabytes, and an online storage capacity of 3 terabytes of low-latency storage that is database and (generally speaking) OS agnostic, expandable, and provides its own backup, deduplication, and compression services. Not bad.

What about Oracle?
I am afraid Oracle licensing is a bit confusing; it depends on what existing licenses you may have, the time of year, where you are located, and how good a negotiator your buyer is.

The best I can do is an estimate based on sources such as TPC-H documents (www.tpc.org). A setup similar to what we have outlined, with RAC and Parallel Query, will cost about $440K as of June 3, 2009 for the initial purchase price and three years of maintenance. I took out the Advanced Compression option since we really don't need it. Adding Oracle licenses into the cost brings our total for the system and software to slightly less than 1.5 million dollars ($1,442,084.00).

How does this compare to the Oracle Database Machine?
To match the ideal solution's IOPS would require 741 Exadata cells, at roughly $18 million in hardware cost and $44 million in software cost, plus support and switches. Since that would be an unfair comparison, we will use data volume instead of IOPS, even though it puts the ideal solution at a disadvantage. The Exadata-based ODM has a quoted hardware price of $600K; for a full system with license costs it could run anywhere from 2.3 to over 5 million dollars. However, this is for a fully loaded machine with 14-42 terabytes of usable capacity. We could probably get by with 2 Exadata cells offering between 1-2 terabytes of storage each on the storage side. This would provide 2 terabytes of high-speed (with the 1TB disks) mirrored Tier Zero and Tier 1 space. We would still need a second set of Exadata cells or some other solution for the backup Tier 2. Each cell with the high-speed disks is only capable of 2,700 IOPS, so our total IOPS will be 5,400. Essentially we would be using a little more than a quarter-size ODM, cutting back to four 8-CPU servers from the full-size ODM total of eight 8-CPU servers, and using only 4 instead of 14 Exadata cells (2 for the system, 2 for backup). The best price estimates I have seen come from http://www.dbms2.com/2008/09/30/oracle-database-machine-exadata-pricing-part-2/ and have been blessed by various ODM pundits such as Kevin Closson. Table 1 summarizes the spreadsheet found at that location. Note that I have added in new costing figures from TPC-H documents, which may be lower than posted prices for the per-disk license costs for the Exadata cells and the Oracle software. The actual price is somewhere between what I have here and essentially double the license cost per disk for the Exadata cells, which would take their total from $240K to $480K. The general Oracle software pricing is based on a 12-named-user license scheme rather than per-processor. The additional cost of support for the Exadata cell software is somewhere between $1,100-2,200 per cell, so that adds an additional $12-24K to the three-year cost.
Config                                   Partial ODM      Full ODM
Exadata Servers (cells)                  4                14
Small DB Servers (4 core)                0                0
Medium DB Servers (8 core)               4                8
Large DB Servers (16 core)               0                0
Total DB Servers                         4                8
Total Cores                              32               64

Hardware
1 Exadata Server cost                    24,000           24,000
Total Storage cost                       96,000           336,000
1 DB server cost                         30,000           30,000
Total DB servers cost                    120,000          240,000
Other items                              50,000           74,000
Total HW price                           266,000          650,000

Software
Exadata server software per drive        5,000            5,000
Exadata server software per cell         60,000           60,000
Total Exadata server software            240,000          840,000

Oracle Licenses (per processor)
Oracle database, enterprise edition      11,875
RAC option                               5,750
Partitioning option                      2,875
Advanced compression option              2,875
Tuning pack                              3,500
Diagnostics pack                         3,500
Total price per processor                30,375
After dual-core discount (50%)           15,188

Oracle License Cost                      486,016          972,032
Total Software price                     725,760          1,812,032
Total System price                       992,769          2,902,032

Table 1: ODM Solution Cost Projections

So it appears that the Exadata solution will cost roughly $450K less initially, for far fewer total IOPS (5,400 versus 2,000,000), with higher latency (5-7 ms versus 0.015-0.25 ms) and less capable servers. However, some of the latency issues are mitigated by the advanced query software resident in each cell. When looking at costing, remember that support for the Exadata cell software is paid yearly in perpetuity, so you need to add that cost into the overall picture.
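The 741-cell figure quoted earlier follows directly from the per-cell numbers used in this paper; here is that arithmetic as a short sketch (the per-cell IOPS and cost figures are the ones quoted above, nothing more):

```python
# How many Exadata cells would be needed to match the ideal system's IOPS,
# using the per-cell figures quoted in this paper.
import math

TARGET_IOPS      = 2_000_000   # ideal (RamSan-based) configuration
IOPS_PER_CELL    = 2_700       # Exadata cell with high-speed disks
HW_COST_PER_CELL = 24_000
SW_COST_PER_CELL = 60_000

cells = math.ceil(TARGET_IOPS / IOPS_PER_CELL)
print(f"Cells needed: {cells}")                                         # 741
print(f"Cell hardware: ${cells * HW_COST_PER_CELL / 1e6:.1f} million")  # ~$17.8M
print(f"Cell software: ${cells * SW_COST_PER_CELL / 1e6:.1f} million")  # ~$44.5M
```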

Green Considerations
The energy consumption of an Exadata cell is projected to be 900 watts. For 4 cells that works out to 3.6 kW, compared to 600 W for each of the RamSan-440 systems and about 325 W for each RamSan-620, for a total of 3.05 kW. Over one year of operation, the difference in energy and cooling costs could be close to $2K all by itself. Once all of the license and long-term ongoing costs are considered, the ideal solution provides dramatic savings. Note that aggressive marketing techniques from Oracle sales may do much to reduce the hardware and initial licensing costs for the Exadata solution; however, the ongoing costs will still be a significant factor. Expansion of the Exadata solution is by sets of 2 cells for redundancy. The ideal solution can expand to 5 terabytes on each of the RamSan-620s by adding cards, and additional RamSan-620s can be added in mirrored pairs of 2 TB base units. You must use ASM and RMAN as the backup solution with the Exadata Database Machine. If your projected growth exceeds the basic machine we have proposed, then you will have to add in the cost of more Exadata cells and associated support costs, driving the Exadata solution well beyond the RamSan solution in initial and ongoing costs. Remember that with the Exadata solution you must run Oracle 11g version 11.1.0.7 at a minimum, and for now it is limited to the Oracle-supported Linux OS, so you have also given up the flexibility of the first solution.
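For reference, the storage-tier power comparison above as a short sketch. The wattages are the figures quoted in the text; the dollar impact depends on local energy rates and cooling overhead, so only the energy difference is computed here.

```python
# Storage-tier power comparison: 4 Exadata cells vs. the RamSan configuration.
# Wattages are the figures quoted above; dollar impact depends on the local
# fully loaded energy rate and cooling overhead, so only energy is computed.
EXADATA_CELLS, WATTS_PER_CELL = 4, 900
RAMSAN_440_UNITS, WATTS_440   = 4, 600
RAMSAN_620_UNITS, WATTS_620   = 2, 325

exadata_w = EXADATA_CELLS * WATTS_PER_CELL                                  # 3,600 W
ideal_w   = RAMSAN_440_UNITS * WATTS_440 + RAMSAN_620_UNITS * WATTS_620     # 3,050 W
delta_kwh_per_year = (exadata_w - ideal_w) / 1000 * 24 * 365                # ~4,800 kWh

print(f"Exadata storage power: {exadata_w} W, ideal configuration: {ideal_w} W")
print(f"Annual energy difference: ~{delta_kwh_per_year:,.0f} kWh before cooling overhead")
```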

Score Card
Let's sum up the comparison between the ideal system and the Oracle Data Warehouse Machine. Look at the chart in Table 2.

Consideration          Ideal Configuration    Oracle DWHM
OS Flexible            Yes                    No
DB Flexible            Yes                    No
Expandable             Yes                    Yes
High IO Bandwidth      Yes                    Yes
Low Latency            Yes                    No
High IOPS              Yes                    No
Initial cost           Higher                 Best
Long term cost         Good                   Poor

Table 2: Comparison of Ideal with Oracle DWHM

From the chart in Table 2 we can see that the ODM matches the ideal system on only a few of the considerations. However, the ODM does offer great flexibility and expandability as long as you stay within the supported OS and with Oracle 11g databases.

The ODM also offers better performance than standard hard disk arrays within its limitations.

Summary
In this paper I have shown what I would consider to be the ideal data warehouse system architecture. One thing to remember is that this type of system is a moving target, as technologies and the very definition of what a data warehouse is supposed to be change. An ideal architecture allows high IO bandwidth, low latency, the capability for a high degree of parallel operations, and flexibility with regard to future database systems and operating systems. The savvy system purchaser will weigh all factors before selecting a solution that may block future movement to new operating systems or databases as they become available.

Appendix A: Server Configuration
PowerEdge R905
Starting Price: $20,224
Instant Savings: $1,800
Subtotal: $18,424
Preliminary Ship Date: 7/30/2009

Date: 7/23/2009 9:36:05 AM Central Standard Time
Catalog Number / Description (Product Code, SKU), Qty 1 each:

PowerEdge R905: R905 2x Quad Core Opteron 8393SE, 3.1GHz, 4x512K Cache, HT3 (90531S) [224-5686]
Additional Processor: Upgrade to Four Quad Core Opteron 8393SE 3.1GHz (4PS31) [317-1156]
Memory: 32GB Memory, 16X2GB, 667MHz (32G16DD) [311-7990]
Operating System: Red Hat Enterprise Linux 5.2 AP, FI x64, 3yr, Auto-Entitle, Lic & Media (R52AP3) [420-9802]
Backplane: 1X8 SAS Backplane, for 2.5 Inch SAS Hard Drives only, PowerEdge R905 (1X825HD) [341-6184]
External RAID Controllers: Internal PERC RAID Controller, 2 Hard Drives in RAID 1 config (PRCR1) [341-6175][341-6176]
Primary Hard Drive: 73GB 10K RPM Serial-Attach SCSI 3Gbps 2.5-in HotPlug Hard Drive (73A1025) [341-6095]
2nd Hard Drive: 73GB 10K RPM Serial-Attach SCSI 3Gbps 2.5-in HotPlug Hard Drive (73A1025) [341-6095]
Rack Rails: Dell Versa Rails for use in Third Party Racks, Round Hole (VRSRAIL) [310-6378]
Bezel: PowerEdge R905 Active Bezel (BEZEL) [313-6069]
Power Cords: 2x Power Cord, NEMA 5-15P to C14, 15 amp, wall plug, 10 feet / 3 meter (2WL10FT) [310-8509][310-8509]
Integrated Network Adapters: 4x Broadcom® NetXtreme II 5708 1GbE Onboard NICs with TOE (4B5708) [430-2713]
Optional Feature Upgrades for Integrated NIC Ports: LOM NICs are TOE, iSCSI Ready (R905/805) (ISCSI) [311-8713]
Optional Network Card Upgrades: Intel PRO 10GbE SR-XFP Single Port NIC, PCIe-8 (10GSR) [430-2685]
Optional Optical Drive: DVD-ROM Drive, Internal (DVD) [313-5884]
Documentation: Electronic System Documentation, OpenManage DVD Kit with DMC (EDOCSD) [330-0242][330-5280]
Hardware Support Services: 3Yr Basic Hardware Warranty Repair: 5x10 HW Only, 5x10 NBD Onsite (U3OS) [988-0072][988-4210][990-5809][990-6017][990-6038]

Appendix B: RamSan-440 Specifications
RamSan-440 highlights:

• The World's Fastest Storage®
• Over 600,000 random I/Os per second
• 4500 MB/s random sustained external throughput
• Full array of hardware redundancy to ensure availability
• IBM Chipkill technology protects against memory errors up to and including loss of a memory chip
• RAIDed RAM boards protect against the loss of an entire memory board
• Exclusive Active Backup® software constantly backs up data without any performance degradation; other SSDs only begin to back up data after power is lost
• Patented IO2 (Instant-On Input Output) software allows data to be accessed during a recovery; customers no longer have to wait for a restore to be completed before accessing their data

I/Os Per Second: 600,000
Capacity: 256-512 GB
Bandwidth: 4500 MB per second
Latency: Less than 15 microseconds

Fibre Channel Connection

• 4-Gigabit Fibre Channel (2-Gigabit capable)
• 2 ports standard; up to 8 ports available
• Supports point-to-point, arbitrated loop, and switched fabric topologies
• Interoperable with Fibre Channel Host Bus Adaptors, switches, and operating systems

Management

• Browser-enabled system monitoring, management, and configuration
• SNMP supported
• Telnet management capability
• Front panel displays system status and provides basic management functionality
• Optional Email home feature

LUN Support

• 1 to 1024 LUNs with variable capacity per LUN
• Flexible assignment of LUNs to ports
• Hardware LUN masking

Data Retention

• Non-volatile solid state disk
• Redundant internal batteries (N+1) power the system for 25 minutes after power loss
• Automatically backs up data to Flash memory modules at 1.4 GB/sec

Reliability and Availability

• Chipkill technology protects data against memory errors up to and including loss of an entire memory chip
• Internal redundancies
  o Power supplies and fans
  o Backup battery power (N+1)
  o RAIDed RAM boards (RAID 3)
  o Flash Memory modules (RAID 3)
• Hot-swappable components
  o Five Flash Memory modules (front access)
  o Power supplies
• Active Backup™
  o Active Backup™ mode (optional) backs up data constantly to internal redundant Flash Memory modules without impacting system performance, making shutdown time significantly shorter
• IO2
  o IO2 allows instant access to data when power is restored to the unit and while data is synced from Flash backup
• Soft Error Scrubbing
  o When a single bit error occurs on a read, the RamSan will automatically re-write the data to memory, thus scrubbing soft errors; following the re-write, the system re-reads to verify the data is corrected

Backup Procedures

Supports two backup modes that are configurable per system or per LUN:

• Data Sync mode synchronizes data to redundant internal Flash Memory modules before shutdown or with power loss
• Active Backup™ mode (optional) backs up data constantly to internal redundant Flash Memory modules without impacting system performance

Size: 7” (4U) x 24”
Power Consumption (peak): 650 Watts
Weight (maximum): 90 lbs

Appendix C: RamSan-620 Specifications

RamSan-620 highlights:

• 2-5 TB SLC Flash storage
• 250,000 IOPS random sustained throughput
• 3 GB/s random sustained throughput
• 325 watts power consumption
• Lower cost

Features

• A complete Flash storage system in a 2U rack
• Low overhead, low power
• High performance and high IOPS, bandwidth, and capacity
• Standard management capabilities
• Two Flash ECC correction levels
• Super capacitors for orderly power down
• Easy installation
• Fibre Channel or InfiniBand connectivity
• Low initial cost of ownership
• Easy incremental addition of performance and capacity

I/Os Per Second: 250,000 read and write
Capacity: 2-5 TB of SLC Flash
Bandwidth: 3 GB per second
Latency: Writes 80 microseconds; Reads 250 microseconds

Fibre Channel Connection

• 4-Gigabit Fibre Channel
• 2 ports standard; up to 8 ports available
• Supports point-to-point and switched fabric topologies
• Interoperable with Fibre Channel Host Bus Adaptors, switches, and operating systems

Management

• Browser-enabled system monitoring, management, and configuration
• SNMP supported
• Telnet management capability
• SSH management capability
• Front panel displays system status and provides basic management functionality

LUN Support

• 1 to 1024 LUNs with variable capacity per LUN
• Flexible assignment of LUNs to ports

Data Retention

• Completely non-volatile solid state disk

Reliability and Availability

• Flash Layer 1: ECC (chip)
• Flash Layer 2: board-level RAID
• Internal redundancies: power supplies and fans
• Hot-swappable components: power supplies

Size: 3.5” (2U) x 18”
Power Consumption (peak): 325 Watts
Weight (maximum): 35 lbs

Appendix D: DataDomain DD120 Specifications
Remote Office Data Protection
> High-speed, inline deduplication storage
> 10-30x data reduction average
> Reliable backup and rapid recovery
> Extended disk-based retention
> Eliminate tape at remote sites
> Includes Data Domain Replicator software

Easy Integration
> Supports leading backup and archive applications from: Symantec, EMC, HP, IBM, Microsoft, CommVault, Atempo, BakBone, Computer Associates
> Supports leading enterprise applications, including databases (Oracle, SAP, DB2), email (Microsoft Exchange), and virtual environments (VMware)
> Simultaneous use of NAS and Symantec OpenStorage (OST)

Multi-Site Disaster Recovery
> 99% bandwidth reduction
> Consolidate remote office backups
> Flexible replication topologies
> Replicate to larger Data Domain systems at central site
> Multi-site tape consolidation
> Cost-efficient disaster recovery

Ultra-Safe Storage for Reliable Recovery
> Data Invulnerability Architecture
> Continuous recovery verification, fault detection and healing

Operational Simplicity
> Lower administrative costs
> Power and cooling efficiencies for green operation
> Reduced hardware footprint
> Supports any combination of nearline applications in a single system
SPECIFICATIONS DD120
Capacity (raw): 750 GB
Logical Capacity (standard): 7 TB
Logical Capacity (redundant): 18 TB
Maximum Throughput: 150 GB/hr
Power Dissipation: 257 W
Cooling Requirement: 876 BTU/hr
System Weight: 23 lbs (11 kg)
System Dimensions (WxDxH): 16.92 x 25.51 x 1.7 inches (43 x 64.8 x 4.3 cm) without rack mounting ears and bezel; 19 x 27.25 x 1.7 inches (48.3 x 69.2 x 4.3 cm) with rack mounting ears and bezel
Minimum Clearances: Front (with bezel) 1” (2.5 cm); Rear 5” (12.7 cm)
Operating Current: 115VAC/230VAC, 2.2/1.1 Amps
System Thermal Rating: 876 BTU/hr
Operating Temperature: 5°C to 35°C (41°F to 95°F)
Operating Humidity: 20% to 80%, non-condensing
Non-operating (Transportation) Temperature: -40°C to +65°C (-40°F to +149°F)
Operating Acoustic Noise: Max 7.0 BA at typical office ambient temperature (23 +/- 2°C)

REGULATORY APPROVALS
Safety: UL 60950-1, CSA 60950-1, EN 60950-1, IEC 60950-1, SABS, GOST, IRAM
Emissions: FCC Class A, EN 55022, CISPR 22, VCCI, BSMI, RRL
Immunity: EN 55024, CISPR 24
Power Line Harmonics: EN 61000-3-2
