BY: Jim Hawkins, MPE/iX Lab

Introduction:

In order to provide HP e3000 customers with more long-term options for disk storage, the MPE/iX Lab has created a group of patches for the MPE/iX 7.5 Release to allow the use of disk modules larger than the current 300 Gigabyte maximum. Collectively these patches will be referred to as the "Large Disk patches", and the term "Large Disk" will be used as shorthand throughout this article.

With the Large Disk patches, MPE/iX can support the attachment and configuration of any sized SCSI-2 compliant disk module and will allow MPE/iX OS access to up to 512 Gigabytes of storage per disk module. Additionally, the Large Disk patches include enhancements to MPE/iX CI Commands and Utilities to more consistently manage Groups and Accounts with greater than 99,999,999 sectors (~24 GB).

Who should consider installing these patches?

1) Customers who are, or soon will be, using 146 GB and 300 GB Disk modules.
2) Customers who currently have, or soon will have, any MPE Groups or Accounts with a total disk space usage of more than 100,000,000 sectors (approximately 24 GB).

The remainder of this article is broken down into the following subject areas:

o Large Disk Project Background
o Large Disk Features in Detail
o Large Disk Patches and Patch Details
o Large Disk Usage Guidelines

Large Disk Project Background:

This effort was undertaken as the result of two independent but related driving forces:

o Internal to HP, it was observed that HP Disk Module sizes have been doubling approximately every 2 years: 4 GB, 9, 18, 36, 73, 146, 300 GB.
o External to HP, members of Open MPE, Inc. observed this same trend and drove the inclusion of an item requesting "Support future large disks" in the Interex 2003 SIB.

Additionally, as our investigation proceeded, other items came to light which underlined the need for some work to be done to support large disk configurations.
o The last major initiative to address disk size was done in MPE XL 4.0 for support of disks larger than 4 GB. These changes were done to address an approximately ten times (10x) increase in disk size, from 404-670 MB to 4.0 GB disks. In 2005 with MPE/iX 7.5 we were confronted with a nearly hundred times (100x) size change (4.0 GB to >300 GB) over what had been possible in MPE XL 4.0.
o The MPE/iX 6.5 "Large File" enhancement allowed bigger Files, and more disk space in each MPE Group and Account. However, it was found that several CI commands and Utilities were limited in their ability to work with the resulting larger Groups and Accounts.

All of these inputs were assessed during the Large Disk investigation and as many as possible were addressed by the Large Disk patches. So what does Large Disk deliver?

Large Disk Features in Detail:

The fundamental goal of Large Disk is to support disks larger than the currently supported maximum size (300 GB). This requires a number of changes to MPE/iX. The following sections outline the features that the Large Disk patches include, as well as a list of features which an MPE/iX knowledgeable person might speculate would be included but are not.

To start, the Large Disk patches intentionally provide the following enhancements to MPE/iX 7.5:

o Large Disk includes the ability to attach and use SYSGEN to configure any sized SCSI-2 compliant Disk. MPE/iX uses the SCSI-2 protocol to connect to SE, HVD and LVD SCSI Disks as well as to Fibre Channel disks (SCSI over Fibre Channel). The SCSI-2 standard allows for disks of up to 2 Terabytes. SCSI-3 disks may be larger but will only report up to 2 Terabytes of storage for SCSI-2 format inquiries.
o Large Disk includes the ability to initialize an MPE/iX Disk Volume of up to 512 Gigabytes on SCSI-2 compliant disks. SCSI-2 Disks that are larger than 512 GB will be "truncated" at the 512 GB limit and the space beyond 512 GB will not be accessible or usable by the MPE/iX Operating System or any user applications running under MPE.
  Note that an MPE/iX Disk Volume includes OS disk resident data structures which do use some disk space, so no more than 511 GB of "User File" space should be expected per disk volume.
o Large Disk includes a number of opportunistic enhancements to MPE Command Interpreter commands and utility programs to 'smooth' the user experience when dealing with large disks, large Groups and large Accounts. These commands and utilities are: :REPORT, :[ALT|LIST|NEW][GROUP|ACCT], FSCHECK, DISCFREE. More details on the changes for each of these will be provided later in this document.

The items below are provided to describe features that are NOT part of Large Disk in anticipation of possible questions:

o Large Disk does not allow the usage of more than 512 GB of disk space on any configured LDEV.

  Why? This has a similar root cause as the pre-7.5 limitation on LDEV 1. In the LDEV 1 case some hardware and software components were limited to accessing disk objects with 32 bit byte pointers, giving an upper address limit of 2^32 Bytes or 4 GB. For Large Disk support it was found that a number of MPE/iX programs and subsystems were limited to 31 bit "Sector" offsets (Pascal/iX Integer variables). Assuming the MPE standard sector size of 256 Bytes, these modules are then limited to addressing a maximum of 2^31 x 256 Bytes = 512 GB. Changes to these modules to allow greater than 31 bit sector offsets, or to MPE to define greater than 256 Byte sectors, were deemed too complex to pursue through the MPE/iX Patch Process.

  (follow-up question) If the biggest disk that HP sells is 300 GB then how do you know that a larger than 512 GB disk can be configured and safely used? Most disks on the HP e3000 use the same "scsi_disc_and_array_dm" driver - this includes Fibre Channel disk arrays. We were able to configure HP Storage Works VA7xxx Disk Arrays to export Logical Devices of 520 and 522 GB.
  As part of our testing we included these >512 GB volumes in MPE/iX Volume Sets which also contained multiple 300 GB and 146 GB disks to test very large MPE Groups and Accounts. (Note: this is a very unorthodox configuration for a VA connected to an HP e3000 - we still recommend that VA LUNs created for use by MPE/iX be between 4 and 10 GB in size.)
o Large Disk does not change the number of files per disk volume. The current limit, established in MPE XL 4.0, remains at ~250,000 files. The number of files per disk is dependent upon the way that files are built and the amount of fragmentation on a disk; it is therefore possible to receive "out of disk space" errors for a smaller number of files before DISCFREE reports 100% utilization. Changing the number of files per volume was judged too complex to pursue through the MPE/iX Patch Process and might have forced changes to MPE/iX disc resident structures. If you need more than 250,000 files in an MPE/iX Volume Set then you will need to spread these files across multiple disks.
o Large Disk does not change the list of currently supported devices for any MPE/iX release, nor does the existence of the Large Disk patches guarantee support for any specific (future) HP disk modules.

  Why? We cannot certify a disk that doesn't exist! While Large Disk provides the potential to use >300 GB disk modules, certification of such modules will need to be done through normal HP processes and may require the installation of additional patches.

Large Disk Patches:

In order to keep patches small and to reduce future patch complexity we made the decision to create a number of independently installable patches for the Large Disk enhancement. There are no "hard" patch dependencies between these patches and they may be installed separately. However, HP strongly advises installing all of these patches at the same time using Patch/iX.
The Large Disk Patches (sorted by Patch ID in alphanumeric order) are:

    MPEMXT1    FSCHECK.MPEXL.TELESUP
    MPEMXT2    [ALT|LIST|NEW][ACCT|GROUP]
    MPEMXT3    SCSI Disk Driver Update
    MPEMXT4    SSM Optimization (>87 GB)
    MPEMXT7    DISCFREE.PUB.SYS
    MPEMXU3    REPORT
    MPEMXU6    CATALOG.PUB.SYS
    MPEMXU7    CIERR.PUB.SYS, CICATERR.PUB.SYS

Details of each patch are provided below, broken down into three groups: "I/O Drivers and SSM fixes", "CI Command enhancements", "Utilities updates".

Large Disk Patch Details: I/O Drivers and SSM fixes.

MPEMXT3 (8606363192) limits the reported size of a SCSI Disk to 512 GB. Without this patch MPE Volume management will not allow you to do a NEWVOL or NEWSET command on a disk larger than 512 GB. Changes are made to both the "scsi_disc_dm" disk driver used for (old) SE-SCSI disks and the "scsi_disc_and_array_dm" driver used for all differential SCSI (HVD & LVD) as well as Fibre Channel disks.

MPEMXT4 (8606391171, 8606340906) includes changes to more effectively utilize the disk space on large disks. Without this patch MPE/iX may not properly spread extents between members of a multi-volume Volume Set. Changes are also made to the thresholds used to decide which disk to send data to when two disks have "equal" utilization; previously increments of 1 part in 100 were used, now we use 1 part in 10,000. Finally, a bug fix was made to correct a problem with DISCFREE.PUB.SYS reporting of LDEV 1 disc utilization.

Large Disk Patch Details: CI Command Enhancements.

MPEMXT2 (8606363187) enhances the CI Commands ALTACCT, ALTGROUP, LISTACCT, LISTGROUP, NEWACCT, and NEWGROUP.

o ALTACCT, ALTGROUP, NEWACCT, NEWGROUP: One may now use the "ALT" and "NEW" commands to specify the maximum size of a Group or Account from 1 (256 Byte) sector up to 2,147,483,648 (4K) Pages, which is 8 Terabytes (this limit was changed in 6.5 Large Files).
  Group and Account limits are set using the parameter FILES=filespace as defined in the new HELP text for these commands:

    filespace   Disk storage limit, in sectors or pages, for the permanent files of the account. A positive value represents the limit in sectors and a negative value represents the limit in pages (4096 bytes). The filespace limit cannot be less than the number of sectors or pages currently in use for the account. Default is unlimited file space, which may be specified by omitting the ;FILES parameter, or by specifying ;FILES=[Return]. If the ;FILES parameter is 2,147,483,647, then the account will be modified to use unlimited filespace.

o LISTACCT and LISTGROUP with the "FORMAT=DETAIL" option will display "DISC LIMIT" and "DISC SPACE" in the form nnnn(SECTORS) or (PAGES)mmm depending upon the value supplied in the FILES= parameter of the "NEW" or latest "ALT" command for the Group or Account. ("FORMAT=DETAIL" is new for LISTGROUP and enhanced for LISTACCT.) The FORMAT=DEFAULT version of these commands will display "**" for a "DISC LIMIT" or "DISC SPACE" that is less than zero, that is, for a page count. (This is to prevent scripting problems where a sector count value and a page count value might be mistakenly added together.)

  For example, in the following output the Account named "BIGACCT1" has a DISC LIMIT value which was initialized as a negative page count value. By default LISTACCT will show "DISC LIMIT: **", while with "FORMAT=DETAIL" it will show "DISC LIMIT : (PAGES)2147483648".
    CSYLE4:listacct bigacct1
    ********************
    ACCOUNT: BIGACCT1
    DISC SPACE: 0(SECTORS)          PASSWORD: **
    CPU TIME  : 25171(SECONDS)      LOC ATTR: $00000000
    CONNECT TIME: 0(MINUTES)        SECURITY--READ    : AC
    DISC LIMIT: **                            WRITE   : AC
    CPU LIMIT : UNLIMITED                     APPEND  : AC
    CONNECT LIMIT: UNLIMITED                  LOCK    : AC
    MAX PRI  : 150                            EXECUTE : AC
    GRP UFID : $05650001 $1C8A3CE2 $00D260D2 $1B002457 $60936B53
    USER UFID: $00000000 $00000000 $00000000 $00000000 $00000000
    CAP: AM,AL,GL,ND,SF,BA,IA

    CSYLE4:listacct bigacct1;format=detail
    ********************
    ACCOUNT       : BIGACCT1
    PASSWORD      : **
    GID           : 208
    DISC SPACE    : 0(SECTORS)
    CPU TIME      : 25171(SECONDS)
    CONNECT TIME  : 0(MINUTES)
    DISC LIMIT    : (PAGES)2147483648
    CPU LIMIT     : UNLIMITED
    CONNECT LIMIT : UNLIMITED
    MAX PRI       : 150
    LOC ATTR      : $00000000
    SECURITY      : R:AC ; A:AC ; W:AC ; L:AC ; X:AC
    GRP UFID      : $05650001 $1C8A3CE2 $00D260D2 $1B002457 $60936B53
    USER UFID     : $00000000 $00000000 $00000000 $00000000 $00000000
    CAP           : AM,AL,GL,ND,SF,BA,IA

NOTE: MPEMXT2 is superseded by patch MPEMXU3; that is, if you install MPEMXU3 you will also receive the MPEMXT2 changes.

MPEMXU3 (8606127582) changes the REPORT command to allow it to display larger numbers than was previously possible. This is accomplished by adding a new FORMAT parameter as defined in the following HELP text:

    FORMAT      Select one of the following output formats, with DEFAULT being the provided format, which may be specified by omitting the ;FORMAT parameter, or by specifying ;FORMAT=[RETURN]. Both formats display the current counts and limits for filespace usage, CPU usage and connect time usage for each group and account displayed. File space is reported in sectors. CPU time is shown in seconds, and connect time is displayed in minutes.

    DEFAULT     This format replaces values exceeding 99,999,999 with "**" to indicate the number is greater than what can be displayed in the column width. "**" is also displayed if the limit value is greater than 99,999,999 and when there is no configured limit.
    LONG        This format supports values up to, but not exceeding, the internal directory limits. The limit for filespace usage is 34,359,738,368 sectors (8 TB). The limit for CPU usage seconds and connect time minutes is 2,147,483,647. These internal limits can be exceeded but REPORT is unable to display values greater than the limit values above. Accounts and groups without configured limits display "UNLIMITED" rather than "**" as in the DEFAULT format.

Example output from REPORT "DEFAULT" and "LONG" follows:

    CSYLE4:report @.bigacct1
    ACCOUNT      FILESPACE-SECTORS     CPU-SECONDS    CONNECT-MINUTES
      /GROUP      COUNT     LIMIT    COUNT    LIMIT    COUNT    LIMIT
    BIGACCT1          0        **    25171       **        0       **
      /GROUP          0        **        0       **        0       **
     */GROUP1        **        **    25171       **        0       **
      /PUB            0   5174912        0       **        0       **

    CSYLE4:report @.bigacct1;format=long
    ACCOUNT         FILESPACE-SECTORS          CPU-SECONDS       CONNECT-MINUTES
      /GROUP         COUNT        LIMIT     COUNT      LIMIT     COUNT      LIMIT
    BIGACCT1             0  34359738368     25171  UNLIMITED         0  UNLIMITED
      /GROUP             0  34359738368         0  UNLIMITED         0  UNLIMITED
     */GROUP1   3832290720  34359738368     25171  UNLIMITED         0  UNLIMITED
      /PUB               0      5174912         0  UNLIMITED         0  UNLIMITED

MPEMXU6 (8606403821) changes CATALOG.PUB.SYS to include error messages for the CI Commands enhanced by other Large Disk patches. The other Large Disk CI Command patches will work without this patch but may display confusing error messages.

MPEMXU7 (8606407668) changes CIERR.PUB.SYS and CICATERR.PUB.SYS to include HELP text for the CI Commands enhanced by other Large Disk patches. The other Large Disk CI Command patches will work without this patch but may display confusing error messages.

Large Disk Patch Details: Utilities Updates.

MPEMXT1 (8606363191, 8606389264) fixes several problems with the utility program FSCHECK.MPEXL.TELESUP. Without this patch FSCHECK may not work well with Groups and Accounts that contain more than 2,147,483,647 sectors, or 512 GB. Specifically:

o SYNCACCOUNTING - This function has been enhanced to ensure that Group and Account totals can be synchronized up to the 8 TB "Large File" limit.
  Additionally, HFS directory objects should now be properly accounted for.
o TOTALEXTENTS - Enhanced to allow totals up to the 8 TB "Large File" limit.

MPEMXT7 (8606166738, 8606340906) fixes issues with the utility program DISCFREE.PUB.SYS. The primary change is to slightly increase the column width for program output to ensure the ability to display sector counts up to the 512 GB limit per disk and to allow display of "Totals" information for ~1000 such disks. Command scripts or programs parsing the output of DISCFREE using absolute column offsets will need to be changed -- those using "white space" parsing techniques to extract information should be unaffected. Additionally, with the MPEMXT4 patch in place DISCFREE should always provide proper disk totals for LDEV #1.

Large Disk Usage Guidelines:

Even with the Large Disk Patches one should be cautious when considering the usage of disks larger than 18-36 GB on MPE/iX systems, for the following reasons:

o MPE/iX transaction throughput increases when MPE is allowed to spread I/O across disks. Even though newer disks are faster than older disks, there are limits to disk speed and bus speed which must be taken into account. Moving from, say, nine 2 GB disks to one 18 GB disk will often create a Disk I/O bottleneck. For best performance we recommend that the number of MPE LDEVs never be reduced - if one has nine 2 GB disks then they should be replaced with nine 18 GB disks to ensure no loss of throughput.
o The HP supplied back-up options for large amounts of data are limited. Most HP e3000 systems were shipped with DDS-2 or DDS-3 (DAT24). The DAT24 is typically rated at 7.2 GB/hr assuming 2:1 compression of data. Even a DLT 80 device is only capable of 42 GB/hr at 2:1 compression. At those rates a single full 300 GB module needs roughly 42 hours on a DAT24 or 7 hours on a DLT 80, so neither of these devices can easily deal with several full 300 GB modules. Therefore, as systems exceed 100 GB of data, alternatives to HP Tape back-up should be considered.
  These might include some combination of the following solutions: HA Disk Arrays, "Store-to-Disk" and/or third party back-up (networked or tape) or data replication products.
o Mirrored Disk/iX is generally recommended for systems where the TOTAL size of all mirrored volumes is less than 50-75 GB (http://jazz.external.hp.com/mpeha/papers/off_white_2004.html, and http://jazz.external.hp.com/mpeha/papers/malicious.html).

  Why? Creation and repair/recovery of mirror disks can take many hours even on reasonably fast systems and even for an "empty" disk. For example, a K-Class 4x100 MHz System using NIO F/W SCSI took 15.75 hours to synchronize one pair of 146 GB Disks when the system was otherwise completely idle and DISCFREE reported 0% full for the mirrored disk volume. This means that any data written to the volumes was not fully protected for nearly 16 hours! Additionally, Mirrored Disk/iX may only repair 6 disk pairs at a time, so the number of pairs as well as the total disk space to be mirrored should be made as small as possible. In conclusion, disks larger than 18-36 GB are "unsuitable" for use with Mirrored Disk/iX under most circumstances.

Appendix: How big is a Kilobyte?

For the purposes of this article the traditional "engineering" or "powers of two" definitions were used for most calculations of disk space. This differs slightly from the newer industry standard "Marketing" figures, which typically use a "powers of ten" definition of these values:

    (Term)        "Engineering" Definition             "Marketing" Definition
    1 Kilobyte    2^10 = 1,024 Bytes                   10^3  = 1,000 Bytes
    1 Megabyte    2^20 = 1,048,576 Bytes               10^6  = 1,000,000 Bytes
    1 Gigabyte    2^30 = 1,073,741,824 Bytes           10^9  = 1,000,000,000 Bytes
    1 Terabyte    2^40 = 1,099,511,627,776 Bytes       10^12 = 1,000,000,000,000 Bytes

For example, an HP 300 GB module will be shown by DISCFREE to contain 1,171,874,992 Sectors, which is 299,999,997,952 Bytes.
Thus, this disk is a 300 GB Disk for "Marketing" purposes but only ~279 GB (299,999,997,952 / 2^30) for "Engineering" calculations in this article.

So how big is the largest disk MPE can support, really? The maximum disk size for MPE/iX is theoretically 2^31 sectors. Due to overhead and rounding, DISCFREE output will show 2,147,483,632 sectors for such a disk; this is equal to 549,755,809,792 bytes. So, a disk of this size would likely be sold as a 550 GB disk (powers of ten) though it contains 512 GB from an engineering perspective (powers of two).
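
The sector arithmetic in this appendix can be re-checked with a few lines of code. The following is only an illustrative sketch in Python: the constant and function names are assumptions made for this example and are not part of MPE/iX or DISCFREE. The sector counts and the 256 Byte sector size are the figures quoted in this article.

```python
# Illustrative sketch only; all names here are hypothetical and are
# not part of MPE/iX, DISCFREE, or any HP-supplied tool.

SECTOR_BYTES = 256        # MPE standard sector size, as stated in the article
ENGINEERING_GB = 2**30    # "powers of two" Gigabyte
MARKETING_GB = 10**9      # "powers of ten" Gigabyte


def sectors_to_bytes(sectors: int) -> int:
    """Convert a DISCFREE-style sector count to Bytes."""
    return sectors * SECTOR_BYTES


def describe(label: str, sectors: int) -> str:
    """Format one sector count under both Gigabyte definitions."""
    size = sectors_to_bytes(sectors)
    return (f"{label}: {sectors:,} sectors = {size:,} Bytes = "
            f"{size / ENGINEERING_GB:.1f} GB (engineering), "
            f"{size / MARKETING_GB:.1f} GB (marketing)")


# Sector counts quoted in the appendix.
HP_300GB_MODULE = 1_171_874_992   # as shown by DISCFREE for a 300 GB module
THEORETICAL_MAX = 2**31           # 31 bit sector offset limit
DISCFREE_MAX = 2_147_483_632      # as shown by DISCFREE after overhead

for label, sectors in [("HP 300 GB module", HP_300GB_MODULE),
                       ("Theoretical maximum", THEORETICAL_MAX),
                       ("Maximum as shown by DISCFREE", DISCFREE_MAX)]:
    print(describe(label, sectors))
```

Running the sketch prints one line per disk size with both Gigabyte figures side by side. The theoretical maximum comes out at exactly 512 engineering Gigabytes because 2^31 sectors x 2^8 Bytes per sector = 2^39 Bytes, and 2^39 / 2^30 = 512.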