Saturday, November 17, 2012

Lustre notes


Lustre

A high performance file system that is guaranteed to make you buy Excedrin(TM) by the box load :) -- high performance, if you can get it to work.

Client side

Settings

/etc/fstab ::
10.46.8.150@tcp0:10.46.8.151@tcp0:/lstr2  /mnt/lustre  lustre  defaults,_netdev  0 0


/etc/modprobe.conf ::
options ksocklnd peer_timeout=0
options lnet networks=tcp0(eth0)
# (2nd line is being tested for stability) :: 
# use eth1 if cluster node is multi-homed, 
# pick the nic on the same network as the storage where the mount takes place

Recommended settings

TCP settings, set before mounting lustre
echo 16777216 > /proc/sys/net/core/wmem_default 
echo 16777216 > /proc/sys/net/core/wmem_max 
echo 16777216 > /proc/sys/net/core/rmem_default 
echo 16777216 > /proc/sys/net/core/rmem_max
echo 4096 87380 16777216 > /proc/sys/net/ipv4/tcp_rmem 
echo 4096 87380 16777216 > /proc/sys/net/ipv4/tcp_wmem 
echo 30000 > /proc/sys/net/core/netdev_max_backlog
ifconfig eth1 txqueuelen ${TXQLEN:-40}
exit 0
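
To make the TCP settings persist across reboots, the same values can go into /etc/sysctl.conf (a minimal sketch; the keys mirror the /proc paths above):

# /etc/sysctl.conf
net.core.wmem_default = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.rmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.core.netdev_max_backlog = 30000

sysctl -p   # apply without a reboot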
Lustre settings, set after mounting
for i in /proc/fs/lustre/osc/*/checksums; do echo 0 > $i; done
for i in /proc/fs/lustre/osc/*/max_dirty_mb; do echo 512 > $i; done
for i in /proc/fs/lustre/osc/*/max_rpc*; do echo 32 > $i; done
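
On newer Lustre versions, the same tunables can be set thru lctl set_param instead of echoing into /proc (a sketch; exact parameter names can be listed with lctl list_param osc.* on versions that support it):

lctl set_param osc.*.checksums=0
lctl set_param osc.*.max_dirty_mb=512
lctl set_param osc.*.max_rpcs_in_flight=32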

Commands

lfs df -h
lctl

Linux Performance Cmd

sudo ethtool eth1		# see nic speed, duplex


# ssh to any compute node (lustre client)
time -p tcpclient 10.46.8.151 50000 100000 -10000

# tcpclient binary obtained from lustre server.

Performance/Tuning

Lustre is a cluster-based file system; striping across multiple servers and disk stores is what gets most of the performance. A typical setup keeps each file within a single stripe (a single OST), so that as different files are created, they end up on different OSTs (servers).
But for a really large file accessed by many clients, the best performance is had if that file is actually striped across OSTs. To do this, use the lfs setstripe command.
To simplify life, settings are applied on a per-directory basis, and Lustre applies them to files created therein; a sketch follows.
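
For example, to mark a directory so that files created in it are striped across all OSTs with a 4 MB stripe size (a sketch using the same option syntax as Method 1 below; newer lfs versions spell the options -S and -c):

mkdir /mnt/lustre/bigfiles
lfs setstripe /mnt/lustre/bigfiles -s 4194304 -i -1 -c -1
lfs getstripe /mnt/lustre/bigfiles    # verify the directory default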
Method Default

# sequentially create files that are 1G each.
# each file will land on different stripe
# (but each file in one stripe)

mkdir test-default
cd    test-default
for i in 0 1 2 3 4 5 6 7 ; do
  time -p dd if=/dev/zero of=$i.tmp bs=1024k count=1024
done

lfs getstripe * 
# will see that diff file have diff obdidx, 
# which means they are on diff OST servers

# note that the first number is tremendously skewed by caching
# using smaller file size will get even more caching and less indicative of sustained performance
# 1 gbps NIC, wirespeed is about 125 MB/s

Method 1: stripe w/in single file

# sequentially create files that are 1G each.
# each file will be striped across the OSTs
# (diff parts of each file land on diff OSTs)

# setstripe command does this
# -s NUMBER 	: stripe size, 4194304 = 4MB
# -i -1 	: index of OST where files start, -1 means any
# -c -1		: stripe count, -1 = stripe across all OSTs.  Fine for up to 8 OSTs, but as OSTs are added, especially in an imbalanced way, this may need tweaking.

mkdir test-file-stripe
lfs   setstripe test-file-stripe -s 4194304 -i -1 -c -1
cd    test-file-stripe
for i in 0 1 2 3 4 5 6 7 ; do
  time -p dd if=/dev/zero of=$i.tmp bs=1024k count=1024
done

lfs getstripe *
# will see that each file has multiple obdidx numbers
# indicating striping w/in a single file

Method 2: Geek testing
Terascala created a script that creates a series of directories at the top level of the lustre file system, whereby different dirs mean data is written with different striping patterns.

# login to lustre server (eg mds1)
cd /root/bin/misc/clients
./setup_directories


The above will result in dirs that are:

/mnt/lustre/ost
	ost-00      ost-01 ... ost-0n
	a series of ost dirs; files written to any of these dirs will stay within that single OST.
	This is a good benchmark to ensure all OSTs are performing equally.
	Test as root:
		for i in  /mnt/lustre/ost/ost* ; do
			time -p dd if=/dev/zero of=$i/dummy-file.tmp bs=1024k count=1024
		done


	lfs getstripe /mnt/lustre/ost/ost-00
	# will show obdidx all pointing to the same number 
	# (within each ost-0n dir)

/mnt/lustre/strip-xM
	# where x indicates stripe size. 1M may be too small for optimal performance;
	# 4M would be a good starting point
	# different-size dirs are created so that 
	# performance tests against each of them can be done quickly
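	A quick way to compare the stripe sizes (a sketch, assuming the dir names created by setup_directories above):
	for d in /mnt/lustre/strip-*M ; do
		echo === $d
		time -p dd if=/dev/zero of=$d/dummy-file.tmp bs=1024k count=1024
	done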



lfs getstripe *

compiling driver

The Lustre modules are kernel-version specific. It would be nice if they used DKMS so that the lustre module is automatically recompiled each time a new kernel is installed on the system.

A patched kernel is required on the server side (ie MDS, OST), but not on the client side (ie compute nodes mounting the lustre fs).

To compile lustre module, the 'quilt' tool is helpful.
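
A rough sketch of a client-only module build (paths and flags here are assumptions; check the Whamcloud build docs for the exact recipe for your kernel):

cd lustre-release
./configure --with-linux=/usr/src/kernels/$(uname -r) --disable-server   # client modules only
make rpms    # produces lustre-client module and utils rpms
# quilt mainly comes into play when applying the kernel patch series for server builds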



Server side

Config

  1. MDS: the metadata server. Typically one is enough, as it only handles file path/metadata lookups. A second standby may be useful for HA. The Lustre MDS is not cluster-based, thus the other node is a standby, not active-active ??
  2. OST: these are the file/data block servers. As many as needed to control the disk shelves and provide performance. Terascala makes these in HA active/standby pairs, also on Dell hardware.

start up

/etc/init.d/tslagent start

cmd

ts 		# similar to tentakle: run a command on all server nodes

for i in 1 2 3 4 5 6; do ts $i echo ======= \; hostname\; mount ; done


Ref/Links

  • http://wiki.whamcloud.com/display/PUB/Documentation





Copyright info about this work

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike2.5 License. Pocket Sys Admin Survival Guide: for content that I wrote, (CC) some rights reserved. 2005,2012 Tin Ho [ tin6150 (at) gmail.com ]
Some contents are "cached" here for easy reference. Sources include man pages, vendor documents, online references, discussion groups, etc. Copyright of those are obviously those of the vendor and original authors. I am merely caching them here for quick reference and avoid broken URL problems.



Where is PSG hosted these days?

http://dl.dropbox.com/u/31360775/psg/psg2.html This new home page at dropbox
http://tiny.cc/tin6150/ New home in 2011.06.
http://tiny.cc/tin6150a/ Alt home in 2011.06.
http://unixville.com/~sn/psg/psg.html (won't be coming after all)
ftp://sn.is-a-geek.com/psg/psg.html My home "server". Up sporadically.
http://www.cs.fiu.edu/~tho01/psg/psg.html (no longer updated as of 2007-06)
http://www.fiu.edu/~tho01/psg/psg.html (no longer updated as of 2007-05)

Sunday, November 11, 2012

LDAP


LDAP is a user database, and it does in Unix what MS Active Directory does in Windows. From an Operating System admin perspective, LDAP would typically be used to replace NIS (or NIS+ or scattered local passwd files). If you are pondering whether you should migrate to LDAP, the simple answer is most likely going to be a 'NO' :)
LDAP is much more complex than NIS, and OS support is still not as rock solid as NIS. Automount map retrieval from LDAP is quite complex and not well supported. Older OS LDAP support is rather spotty, and forget about OSes that have been EOL-ed by the vendor!
So, in short, unless you know you need LDAP, stay away from it :) And if you do use it, hopefully this page has some useful info for you.

LDAP Tree Conceptualization

While LDAP is a user "database" repository, do not think of it as a relational database. It has no resemblance to an RDBMS. The only things that bear resemblance to an RDBMS are the "key" and maybe "attributes"; other than that, forget everything you learned about relational DBs when working with LDAP. Definitely forget about Boyce-Codd Normal Form!!
File Cabinet Numbering
...
Special File System Tree
...

LDAP User Management

To disable/lock out a user, one can add the attribute nsAccountLock: true to the user dn. The GUI console provides a way to click "disable" under the Accounting tab. However, OS support for this flag is spotty: for example, RHEL 3 honors it, but Solaris 8 does not. Thus, the admin should also change the user's default shell to something like /bin/false.

The Unix sysadmin tradition is to also change the user's password to some random text to prevent the user from logging in. Sure, theoretically there is a chance the password could still be guessed, but this isn't likely in practice, especially if the text involves non-printing characters. Finally, note that if the POSIX account is completely disabled, it would delete the UID info, which is typically undesirable given the need to keep UID/GID for historical reference (file ownership, etc).

LDAP Profile Management

Some OSes need a profile stored on the LDAP server so that client-side LDAP config can be updated in a single server location, rather than on each machine. In practice, profiles may add too much complexity for too little gain.

Sun also claims profiles add security, preventing a user from joining a machine to LDAP without the LDAP admin's knowledge. However, if the user has root on one client, the files in /var/ldap can be copied from one machine to another. The encrypted password is copied along, obviating the need to know any password to bind to the Directory. Lastly, most sites allow anonymous bind anyway, just as ypcat passwd is not secured. At the end of the day, the sysadmin doesn't have a choice whether to use profiles or not: Solaris and HPUX use profiles; AIX and linux don't.

Some admins like to create one profile for each version of a given OS, so that gratuitous vendor changes in a new version won't affect other versions. This comes at the cost of maintaining more profiles. Pick your own poison :) On the other hand, if there are multiple sites, then site-specific profiles would typically make sense.

If there is only one domain, storing profiles near the root would be good, eg OU=profiles,dc=unixville,dc=com. But even if there are multiple domains, it may be best to put them in the same location anyway, using different names for profiles belonging to different domains/sites. This is mostly to ensure the client bind process can locate the profile easily, with no guesswork as to which part of the tree the profile came from.

Solaris
So far, Solaris 8 thru 10 can use the same profile without any noticeable problem. It seems easiest to create the profile from a Solaris 9 client, binding as Directory Manager to seed this process.
HPUX
Each version of HP-UX adds a bit more "feature" into the ldapux-profile: 11.00 has a basic profile, 11.11 added something, and 11.23 bloated it up even more. The nature of ldap clients is that they ignore directives they don't understand. As such, it worked pretty well to seed the initial profile using an 11.23 client, binding as Directory Manager to upload the profile directly to the LDAP server. Once done, the client can be reconfigured to bind as the proxy agent.

Lastly, I would like to mention that it is possible for LDAP-UX to use the Solaris profile, at least for HP 11.11. However, the latest versions of the two OSes seem to diverge, and maintaining separate profiles may reduce compatibility problems down the road.

LDAP Gotchas

Smart Referrals
You probably want to stay away from them from a NIS-migration perspective. Linux is probably the only platform that supports them. AIX will traverse them, loop thru the many servers set by the smart referrals, and cause huge delays in telnet session connection and automount map retrieval, making the machine extremely sluggish.
Solaris is a bit better. It still crosses the smart referral servers more often than needed, resulting in delayed performance. Even when the profile is set so that it should not follow smart referrals, it doesn't honor it. Performance is acceptable if smart referrals are a must, but even then, I don't see automount map referrals functioning correctly.
HP-UX: don't know if it causes problems yet, but it doesn't correctly retrieve data from smart-referral automount maps.

RSH vs REXEC
The protocol definitions for RSH vs REXEC and other similar commands: (a) rsh uses the Berkeley rcmd(3) library, and requires the binary to be setuid root so that it runs on privileged ports and scans for .rhosts on the remote server.
(b) rexec uses the rexec(3) library. It does not scan .rhosts, so there is no need to run as root. Instead, it provides the user password in clear text to the remote server to log in. remsh is the same as rsh, just renamed to avoid a clash with the "restricted shell".
(c) rlogin is for remote console login. It handles fewer control characters than TELNET. Most systems implement "rsh host" without a command as a call to rlogin for interactive login; RSH is left for executing a command on the remote host and streaming the output back to the source machine. RLOGIN also uses the Berkeley rcmd(3) library.
(d) rcp is remote copy, akin to rsh wrapped around files directly.
(e) ssh's default behaviour is to emulate rsh with a command. For interactive login it roughly follows the rlogin algorithm, but handles control characters much better, akin to telnet. Further, ssh is smart enough not to read the user login/password from a stdout redirect; it must ask on an interactive keyboard (expect can emulate keyboard login).

Note that
(1) RSH/RLOGIN/SSH scan the user db differently than
(2) TELNET/REXEC.
At least in AIX, the former set seems to invariably look for the local /etc/passwd user first, whilst the latter looks at LDAP. nsswitch.conf on other systems may affect this, but Solaris 8 RSH is looking at local passwd first for sure. Further note that AIX 5.3 "rsh hostname cmd" behaves more like (2). It is a bizarre world of intricate details and buggy implementations and/or protocols :-/ Note that the aix man page on rexec says it only looks at COMPAT, but in fact it looks at LDAP first!

AIX has a variable AUTHSTATE that gets set to:
  • compat when logging in using rsh
  • LDAP when logging in using telnet
  • files can be set manually to force UID-to-username resolution thru files first (it still goes thru the other methods if the entry is not found in files).
    It seems safe to set this variable in /etc/profile and /etc/csh.cshrc to "compat", and all seems to work well, sourcing local files first; a sketch follows.
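    A minimal sketch of that setting:
    # in /etc/profile (sh/ksh):
    AUTHSTATE=compat ; export AUTHSTATE
    # in /etc/csh.cshrc (csh):
    setenv AUTHSTATE compat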

    RSH issues
    RSH is likely to work in an LDAP environment. However, getting RSH to let a user log in without prompting for a password (use of the insecure .rhosts, etc) is likely to be more problematic. Solaris may need patches and config in pam.conf. HP-UX will need updates to pam.conf, and perhaps a myriad of other configs depending on the security model. Refer to the HP PAM doc.

    SSH issues
  • If you compiled OpenSSH from source, you will need to include PAM support.
  • If you got the openssh package from SunFreeware.com, the older versions won't work with LDAP.
  • If you are using vendor-provided SSH and vendor-supplied LDAP libraries, then all should be good. If, on the other hand, third-party LDAP libraries are added, eg from PADL.com, then things may break.
    Automount issues
    The typical sysadmin would put lots of automount maps on NIS, and just about every unix OS automounter can retrieve these maps from NIS. However, such an assumption would be very wrong when migrating to LDAP. While the LDAP server can store anything one throws at it, the data may not always be retrievable :( There is an RFC defining how to store automount maps in LDAP, and when configured correctly, it will work. But it needs substantial client-side support: autofs on just about every OS client needs to be updated. Solaris 9 and older need patches. AIX 5.2 and older don't have upgrades available. HP-UX 11i can use the Enhanced-AutoFS package, but it needs a lot of OS patches and kernel recompilation. Newer Linux can run up2date to patch the autofs rpm. See the client section below for more details. In general, treat each version of each OS as an independent entry in the configuration matrix, and test everything to ensure your config works!

    LDAP command

    ldapsearch

    Query ldap directory server info, output in LDIF format.
    Sample ldap search commands...
    solaris 10 argument structure:
    ldapsearch -b SearchBase [options] FILTER [attributes]
     [options]
     -h ldaphost # ldap server to connect to, default to localhost
     -D bindDN # user used to connect to LDAP, default to anonymous
     -d n  # debug level, bits flags. 
     -e   # minimizes base-64 encoding (like tab!)
     -T  # don't fold/wrap lines.  ldif treats lines starting with a space as
       # continuation of the previous line; def width is 80 chars.
     -p 1234  # use port 1234 (default ldap use 389, TLS is 636)
     -L  # ...
     [attributes]
     select the attributes to list.  Defaults to all, but can limit the display to only certain ones, eg:
     dn   # list only the dn entry
     dn cn  # list both dn and cn entries, nothing else.
    
    
    
    
    ldapsearch -b "dc=unixville,dc=com" -h ldapsvr        uidNumber=5001
    ldapsearch -b "dc=unixville,dc=com" -h ldapsvr -p 389 gidNumber=5001
     # find entry with a given uid or gid number.
    ldapsearch -b "dc=unixville,dc=com" -h ldapsvr memberUid=tin dn
     # find all groups that tin is a member of (unix secondary group membership)
     # display only the dn (name of group and "domain" group is defined in)
    ldapsearch -b "l=sf,l=us,dc=unixville,dc=com" -h ldapsvr uid=* dn
     # list all users in the "domain" l=sf,l=us
    ldapsearch -b ou=Groups,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com -h ldap007 cn=* dn
     # list all group names of a given domain.
    
    
    ldapsearch -b "dc=unixville,dc=com" -h ldapsvr "uid=tin*" dn cn uidNumber
     # find all username starting with tin, display only the fields dn, cn, uidNumber.
    ldapsearch -b "ou=us,dc=unixville,dc=com" -h ldapsvr "givenName=*tin*" dn givenName uidNumber
     # find all users whose real name contains tin anywhere, case insensitive
    ldapsearch -b "ou=us,dc=unixville,dc=com" -h ldapsvr -D "cn=Directory Manager" "givenName=tin" userPassword
     # -D = perform search using specific user credentials
     # Certain attributes such as shadow password can only be retrieved by
     # a privileged user.
     # Finally, some info is only available on the Directory Server (eg via
     # export) but not as ldapsearch at all.  eg attributes for Person entry: 
     # creatorsName, modifiersName, modifyTimestamp, nsUniqueId
    
    
    ldapsearch -b "cn=config" -h ldapsvr -D "cn=Directory Manager" "objectClass=*"
     # retrieve config info; objectClass=* serves as a wildcard for "all"
    ldapsearch -b "cn=config" -h ldapsvr -D "cn=Directory Manager" "objectClass=*" | grep  passwordStorageScheme
     # grep for the password encryption scheme (crypt, ssha, etc).  
     # aix 5.3 only supports crypt
     # solaris and linux support both crypt, ssha.
    
    ldapsearch  -b "cn=schema" -h ldapsvr -D "cn=Directory Manager" "objectClass=*" 
     # retrieve all info on the schema of the directory tree
    
    ldapsearch -h ldapsvr  -b "o=NetscapeRoot" -D "cn=directory manager" "objectClass=*" 
     # retrieve fedora directory server internal config info
     # NetscapeRoot cuz fedora/redhat ds is based off the old netscape directory server 
    
    ldapsearch -h ldapsvr -L -b automountMapName=auto_master,l=sf,l=ca,c=us,dc=element50,dc=com objectclass=*
     # something similar to "ypcat auto.master"
    
    ldapsearch -h ldapsvr -T -b automountMapName=auto_home,ou=us,dc=unixville,dc=com  objectClass=*  dn                   | grep -v ^$ 
    ldapsearch -h ldapsvr -T -b "ou=us,dc=unixville,dc=com"                          automountkey=*  automountInformation | grep home
     # list automount maps entries for auto_home, similar to "ypcat auto.home"
    
    ldapsearch -h ldapsvr -T -b automountMapName=auto_home,ou=us,dc=unixville,dc=com  automountKey=tin
     # retrieve automount info about /home/tin
    
    ldapsearch -h ldapsvr -T -b dc=unixville,dc=com  automountkey=/home
     # find out where /home is referred to and how it is defined (auto.master, auto_master, which domain/ou)
    
    ldapsearch -h ldapsvr -b dc=unixville,dc=com nisnetgrouptriple=*lungfish* | less
     # find out which netgroup a machine called lungfish belongs to, long output!
    
    

    AIX
    ldapsearch is located in /usr/ldap/bin. Parameters are similar to Solaris.
    Linux native
    Parameters used by /usr/bin/ldapsearch from the openldap-client rpm, most of them similar to the Solaris ldapsearch:
    ldapsearch [options] FILTER [attributes]
     [options]
     -x   # no SASL (option not in Solaris)
     -LL  # suppress comments in output
      -b SearchBase # specify the starting point where search will begin.  Typically root.
     -h ldaphost # ldap server to connect to, scan /etc/ldap.conf if configured.
     -D bindDN # user used to connect to LDAP, default to anonymous
     -d n  # debug level, bits flags. 
    
               [------------- options -------------]   [-- FILTER (req) --] [attr]
    ldapsearch -b dc=hybridauto,dc=com -h ldap007 -x   nsds5ReplConflict=*    dn    | grep -v ^$
     # find all entries with replication conflict problems, 
     # where the dn has nsuniqueid appended to it.  eg:
     # nsuniqueid=f0b6791e-1dd111b2-80dba88a-997d0000+uid=duptest,ou=people,dc=hybridauto,dc=com
     
    

    FEDORA-DS
    Parameters used by /opt/fedora-ds/shared/bin/ldapsearch installed by the RedHat/Fedora DS:
    Some strange Linux machines' default ldapsearch
    (probably old-school linux machines...):

    ldapsearch -x -ZZ -s "dc=unixville,dc=com" -b ""
     -x  = no SASL
     -ZZ = use TLS
     -s  = search base
    
    
    

    ldapadd

    ldapadd will add info to the directory server, erroring out if the entry already exists (as defined by the dn). It must be done while the Directory Server is running, live. (ldif2db will overwrite; see below.)
    FEDORA-DS
    ldapadd -x -W -c -D "cn=Directory Manager" -h ldapsvr -f data.ldif
     ldapadd is really "ldapmodify -a", so it shares the same options; see below
    
    
    Sample data.ldif file used to add a user, a group, and a simple automount map entry for the home directory.
    
    #
    # add a user 
    #
    dn: uid=tin,ou=People,l=sf,c=us,dc=unixville,dc=com
    uid: tin
    cn: Tin Ho
    givenName: Tin
    sn: Ho
    mail: tho01@yahoo.com
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    objectClass: posixAccount
    objectClass: top
    userPassword: {crypt}solarisShadowOk
    loginShell: /bin/bash
    uidNumber: 168
    gidNumber: 168
    homeDirectory: /nfshome/tin
    gecos: optional NIS gecos field 
    
    #
    # eg for adding a group
    #
    dn: cn=sn-group,ou=Groups,l=sf,c=us,dc=unixville,dc=com
    objectClass: posixGroup
    objectClass: top
    cn: sn-group
    gidNumber: 168
    memberUid: moazam
    memberUid: rlee
    memberUid: lys
    
    #
    # eg for automount entry (automount object need to be already defined prior to this add)
    # this form is acceptable to Solaris and new Linux autofs (ditto for Aix and Hpux, 
    # but the old linux autofs will not understand it, so get rpm 4.1.3-174 or newer)
    #
    dn: automountKey=tin,automountMapName=auto_nfshome,l=sf,c=us,dc=unixville,dc=com
    objectClass: automount
    automountKey: tin
    cn: tin
    automountInformation: -rw casper:/export/home/&
    
    When first setting up the LDAP repository, initial maps for auto.master, auto.nfshome, etc need to be defined. It may be easier to do this using the GUI; see below. The LDIF files defined here can be used for addition, or for verification in subsequent ldapsearch. Pay special attention to dot(.) vs underscore(_) below.
    
    #
    # auto.master direct map (Linux) 
    # 
    dn: automountMapName=auto.master,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    objectClass: top
    objectClass: automountMap
    automountMapName: auto.master
    
    dn: automountKey=/nfshome,automountMapName=auto.master,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    objectClass: automount
    automountKey: /nfshome
    cn: /nfshome
    automountInformation: ldap:automountMapName=auto_nfshome,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    
    dn: automountKey=/net,automountMapName=auto.master,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    objectClass: automount
    automountKey: /net
    cn: /net
    automountInformation: -hosts
    
    
    #
    # auto_master direct map (Solaris?)
    #
    dn: automountMapName=auto_master,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    objectClass: top
    objectClass: automountMap
    
    dn: automountKey=/nfshome,automountMapName=auto_master,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    objectClass: automount
    automountKey: /nfshome
    automountInformation: auto_nfshome -rw,hard,intr,vers=3,rsize=32786,wsize=32786
    
    dn: automountKey=/net,automountMapName=auto_master,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    objectClass: automount
    automountKey: /net
    automountInformation: -hosts
    
    #
    # auto_nfshome
    #
    dn: automountMapName=auto_nfshome,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    objectClass: top
    objectClass: automountMap
    
    
    

    ldapmodify

    Change existing data already stored in the Directory Server
    Solaris
    
    ldapmodify -D uid=tinh,ou=people,dc=geneusa,dc=com -h ldapsvr -p 1389 -f ./data.ldif
    
    
    FEDORA-DS
    ldapmodify -x -W -c -D "cn=Directory Manager" -h ldapsvr -f data.ldif
     -h specify the server to connect to to perform the add
     -f FILENAME; if using a path with the filename, must use /full/path/to/file
        If no filename is given, ldapmodify expects all commands to come from stdin, one line at a time;
        an empty line by itself indicates end of record.
     -x = simple auth instead of SASL
     -W = prompt for password on the CLI
     -c = continuous operation, instead of exiting when errors happen
     -D USER = the user to perform the change as
    
     -v = verbose
     -n = dry run, don't actually do anything
    
    
    #
    # modify user account: add the objectClass=shadowAccount
    # so that the user can log in to Solaris 8 and related machines
    # Note that some ldapmodify binaries may choke on comments!!
    # (Solaris and many Linux versions can't parse #)
    # Blank lines are potential problems, so avoid them :)
    #
    dn: uid=tin,ou=People,l=sf,c=us,dc=unixville,dc=com
    changetype: modify
    add: objectClass
    objectClass: shadowAccount
    
    
    #
    # Add a password field to a user whose account has an empty password
    # ie, no userPassword clause defined at all
    #
    dn: uid=mlee,ou=People,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    changetype: modify
    add: userPassword
    userPassword: {crypt}*notSet*
    
    
    
    #
    # Change user password field to indicate that it is in locked state.
    #
    dn: uid=tho,ou=People,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    changetype: modify
    replace: userPassword
    userPassword: {crypt}*AccountLocked-2006-07-26*
     
    #
    # Change account to locked state; not all OSes honor this.
    #
    dn: uid=tho,ou=People,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    changetype: modify
    add: nsAccountLock
    nsAccountLock: true
    
    
    
    
    #
    # Change a group definition: add a user to its membership list
    #
    dn: cn=sysadmin,ou=Groups,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com
    changetype: modify
    add: memberUid
    memberUid: tho
    memberUid: mlee

    Import/Export

    db2ldif

    /opt/redhat-ds/slapd-ldapsvr/db2ldif -s dc=unixville,dc=com -a export.ldif
    Export all site data for the domain unixville.com, placing it in LDIF format. This file can then be edited, eg sed 's/\t/ /g' to remove all tabs, and then reimported.
    For the -a option, the output file may need the full path specified, and the output location may need to be writable by user "nobody". If option -a is not used, the output is stored automatically in a location where "nobody" can write, typically /opt/redhat-ds/slapd-ldapsvr/ldif.

    /opt/redhat-ds/slapd-ldapsvr/db2ldif -n userRoot
    Brief test; outputs info related to the "default master db" that stores root-level data, such as profiles, etc. It does not output data in the sub-suffix databases that store user data, so it is usually small.

    ldif2db

    /opt/redhat-ds/slapd-ldapsvr/ldif2db -s "ou=us,dc=unixville,dc=com" -i /full/path/to/import.ldif
    Import data in LDIF format. Typically required to import a specific section of the domain. It will overwrite data that already exists under the subtree of the import. The import has to be done while the directory server is turned off. If using replication with multiple servers, the other machines need to be re-initialized from this master server that did the import.

    LDIF

    LDIF is the standard data exchange format. It is a simple ASCII file with special text formatting. Each entry has a dn. Different entries are separated by blank lines. Lines that begin with a space are considered continuations of the previous line, which is typically wrapped at 80 chars. Anything not presentable in standard ASCII is encoded in base64.
    Besides LDIF, there is also a format based on the XML standard called ____. It seems to be seldom used.
    Additional example of ldif file for various objects can be found at: http://docs.sun.com/app/docs/doc/816-4556/6maort2ro?a=view

    FEDORA-DS

    Fedora Directory Server is the open source version of RedHat Directory Server.
    The RH DS came from the Netscape Directory Server (6.1). The Netscape DS was also developed by Sun and branded as iPlanet DS and later Sun ONE DS. HP also repackages the Netscape/Red Hat DS with their LDAP-UX. Thus, these products are largely the same.

    GUI Console

    Fedora DS has a Graphical Console. There are two pieces to it: an app/web server, typically configured to listen on port 5555, which can be started as:
    cd /opt/fedora-ds
    ./start-admin
    And a client part. A web browser can point to http://ldapsvr:5555/ and get lightweight ldap server queries, or use the full-featured java client, which is started as:
    cd /opt/fedora-ds
    ./startconsole
    The GUI console can be used to perform admin tasks on the Directory Server configuration or to modify data in the LDAP data tree. Below are several examples of adding objects in the "database" tab.

    Add Unix User

    Right click on the People OU, click new, user. For a Unix user, check the Posix Account entry. Furthermore, go to advanced attributes and add the "shadowAccount" attribute.

    Add Unix Group

    Right click on the Groups OU, click new, other object, posix group.

    Add Automount Map

    An automount map defines an entirely new set of automount entries, such as /import, /products, etc.
    Here is an example of defining automount maps in LDAP for the first time: adding an auto.master, an auto_master, and finally a sample indirect map for /nfshome. Pay special attention to dot(.) vs underscore(_) below.

      Defining auto.master (master map used by Linux, AIX and HP-UX?)
    1. Right click on the desired OU where the automount map should be added (eg, ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com)
    2. Click new, other object, automountMap
    3. Enter "auto.master" in the automountMapName field.
    4. Then, inside this newly created auto.master object, right click, new, other object, automount
    5. In the automountInformation field, enter info like "ldap:automountMapName=auto_nfshome,ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com"
    6. In the automountkey field, enter "/nfshome"

      Defining auto_master (master map used by Solaris?)
    7. Right click on the desired OU, as in step 1 above
    8. Click new, other object, automountMap
    9. Enter "auto_master" in the automountMapName field.
    10. Then, inside this newly created auto_master object, right click, new, other object, automount
    11. In the automountInformation field, enter info like "auto_nfshome"
    12. In the automountkey field, enter "/nfshome"

      Defining auto_nfshome (indirect map used by all systems)
    13. Right click on the desired OU, as in step 1 above
    14. Click new, other object, automountMap
    15. Enter "auto_nfshome" in the automountMapName field.
    16. Entries in auto_nfshome will be added in the section below "Add Automount Entry".
    The info can be verified by using ldapsearch and compare it against the ldif used in the ldapadd section above.

    Add Automount Entry

    Sample automount entries would be the various users' home dirs defined under /home. Thus, /nfshome/tin with content nfssvr:/export/home/tin would be one automount entry.
    For a user home directory, the automount entry defines which server provides the actual mount. To add a new entry:
    1. right click on the desired automount map (eg auto_nfshome)
    2. click new, other object, automount.
    3. Enter the mount key into the automountKey field
    4. Enter mount path info and mount option into the automountInformation field

    Change Default Posix (Unix) Password Encryption

    When creating unix user accounts using the GUI console, the password strings need to be encrypted with the same method that the client OS expects. While newer OSes can support MD5 or SSHA, CRYPT is still the universal standard. Thus, it is best to set the console to encrypt user passwords using CRYPT by default. Solaris NIS uses CRYPT by default.

    Method 1 - What Red Hat consultant told me and I know it works:
    1. Go to the Directory tab
    2. Click on the config node in the tree, right click, select advanced properties (in a nutshell, we are editing the properties of the node cn=config).
    3. Scroll to the bottom where the attribute "passwordStorageScheme" is listed. By default it says SSHA; change it to "CRYPT".
    Method 2 - What is described by the Admin Guide:
    1. Go to Config tab,
    2. Data node in the LDAP tree.
    3. On right side, go to Password tab. Toward the bottom, it has option to choose password encryption policy. The default is SSHA, change it to CRYPT.

    To verify that the default password is created using CRYPT, perform a password change on a test user, then perform an ldapsearch on this user using the cn=Directory Manager credentials. The encryption method is prefixed as {CRYPT} or {SSHA}.
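    For example (a sketch; substitute your own base DN and test user):
    ldapsearch -b "dc=unixville,dc=com" -h ldapsvr -D "cn=Directory Manager" uid=testuser userPassword
     # userPassword: {CRYPT}ab8K...   <-- new CRYPT default took effect
     # userPassword: {SSHA}dGhp...    <-- still on the old SSHA default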

    Backing up DB

    /opt/fedora-ds/slapd-SVR/db2bak  # run the backup
    /opt/fedora-ds/slapd-SVR/bak/DATE/ # dir where the files are stored (mostly Berkeley DB files)
    

    Index

    Some fields are not indexed by default; when LDAP is used as a NIS replacement, adding these indexes greatly improves performance:
    • gidnumber
    • uidnumber

    To create indexes:
    1. Go to config tab
    2. Click on "data" in the ldap tree
    3. Expand to the desired database container config; the indexes tab should appear on the right
    4. Click on attribute, and select the attribute that should be added to the index.
    5. Typically, "equality" is the box that needs to be checked for the index.
    The above steps need to be done on each LDAP server, for each database repository (eg when different domains/OUs use different database backends).
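    Fedora DS also ships an offline db2index utility in the instance directory, next to db2ldif, for rebuilding indexes from the command line (hedged: the script name and options vary between DS versions, so verify against your install):
    /opt/fedora-ds/slapd-SVR/db2index -n userRoot -t uidnumber
    /opt/fedora-ds/slapd-SVR/db2index -n userRoot -t gidnumber
    # run with slapd stopped, then restart it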

    Replication Conflicts

    Records from replication with conflicts are marked by nsds5ReplConflict. If the conflicts are specific to a single domain (and especially in a dedicated consumer), the easiest way to remove them is to "Initialize the Consumer" from the Replication Config.

    Otherwise, refer to http://www.redhat.com/docs/manuals/dir-server/ag/7.1/replicat.html#1106141

    Replication Agreement


    Example of setting up a new server for a new domain, and adding replication.
    Step#  |  Central Master Server (eg ldap1)  |  Local Replica (new server, eg ldap3)
    I. Setup Directory Server Admin metadata
    0   Run ./setup to configure the directory server.
    Enable "Change Log", db dir = /opt/redhat-ds/slapd-ldap*/changelogdb/ (owned by nobody:nobody, chmod 755)
    restart slapd
    1   Create user "uid=replication manager" under "cn=config" (in Directory tab)
    2   Remove unnecessary data from the directory tree (eg: People, Group, Special users)
    3   For replication, userRoot, ensure to
  • check "Enable Replica"
  • uid=replication manager,cn=config
  • dedicated consumer
    4   Replicate "userRoot" data to the new server.
        This will set up the root dc=hybridauto,dc=com on the new remote server.

     
    II. Replicate existing domain/database to new server
    5   In config tab, create databases matching all desired subsuffixes for the data/domains to be made available on this new server
    6   Check to ensure all subsuffixes for the dbs defined above exist; they should have been replicated from Part I.
    7   Enable replication (on the Local Replica this would typically be a dedicated consumer, so read-only).
    8 Add replication agreement for each database/domain defined in step 5.  
    III. Setting up new domain, getting data to new server
    9 In config tab, create new subsuffix with db for it. eg, create ou=seattle,ou=on,ou=na,dc=hybridauto,dc=com  
    10 In database tab, create a matching subsuffix  
    11 Enable replication. (On the Central Master Server and Backup Master Server, this would need to be a "multi-master" replication)  
    12   In config tab, create subsuffix matching that created in step 9, create the db with it to store the data locally
    13   In database tab, create ou matching above (may not really need to create this manually)
    14   Assign a unique replication id to this machine to use (if servers are numbered sequentially, this is a good number to use).
    15   Enable replication. (It may be desirable to set up the Local Replica to act as Local Master for a domain that hosts local users; in that case the replication would be "multi-master", not a dedicated consumer.)
    IV. Setup Replication agreement
    16 In config tab, replication branch in the tree:
  • centralServer-seattle, new agreement: SE-central2se
  • consumer is ldap3:389
  • simple auth, uid=replication manager, cn=config (created in step 1, enabled for writing in step 3)
  • Do NOT initialize consumer (at this time).
    17   In config tab, replication branch of the tree, create a "back-fill" replication from the local master back to central master, db name: localMaster-seattle:
  • right click and add a new replication agreement.
  • call it SE-SE2central
  • consumer: ldap1:389 (central master server)
  • Do NOT initialize consumer
    18   Tail the error log (slapd-ldap1/logs/error).
         Initialize the consumer.

     
    V. Replication with Backup Master - repeat stage IV, adjusting as follows:
    19 (step 16): name the replication agreement SE-backup2se, multi-master replication  
    20   (step 17) names for replication:
  • localMaster-SE
  • SE-SE2backup

  • When defining the replication agreements, for databases that need multi-master it may be best to select that type of replication from the beginning. This is because the GUI console is rather buggy, and it has a tendency to put in a random number if the replication is first set up as a dedicated consumer. If an error happens, it is probably best to blow away the replication agreement and redo it. For the really brave, the slapd server can be shut down and the dse.ldif file edited manually :) It is actually possible to define a replication agreement as an ldif file that one just pastes into dse.ldif, but I don't know the details.

    Other things to remember when setting up an additional server:
  • Copy custom schemas into the schema directory and restart the DS. /opt/redhat-ds/slapd-*/config/schema, typically starting with numbers 61 thru 98.
  • Uniqueness plugin: either thru ldif import or set it up manually in the GUI console.
  • Tail the error logs when doing replication tests. Sometimes the console reports status as good when in fact the error log shows replication problems.
  • Ensure each server is assigned a unique replication id. The last octet(s) of the server IP address may work. Naming the servers in numerical sequence and using that number may also be a good idea.
  • Perform any server-specific changes on each machine, eg change password encryption from the default SSHA to the more compatible CRYPT.

    Server Transfer


    If there is ever a need to migrate the RedHat Directory Server from one physical server to another, here is one possible method. It assumes that at least one other server can pick up the whole workload, or that downtime is acceptable.
    1. Stop slapd
    2. Create a tar of the whole /opt/fedora-ds
    3. Transfer this tar to the new server
    4. Untar into /opt/fedora-ds (rename the existing dir if the fedora rpm has already been added)
    5. Shutdown or otherwise network disconnect the old ldap server
    6. Configure the new LDAP server to have hostname (and optionally IP) of the old server
    7. Start slapd on the new server

    LDAP Client Config and Troubleshooting

    Solaris

    Config files:
  • /etc/pam.conf
  • /etc/nsswitch.conf
  • 
    /etc/init.d/ldap.client stop  # restart ldap client bind process
    /etc/init.d/ldap.client start
    
    svcadm enable network/ldap/client # solaris 10
    
    /usr/lib/ldap/ldap_cachemgr -g   # generate a new cache, display status
    
    /etc/init.d/nscd stop   # restart name service daemon
    /etc/init.d/nscd start
    
    
    

    ldaplist

    Solaris supports the ldaplist command, which does the equivalent of ypcat in the NIS world. It is easier than ldapsearch in the sense that it uses the BASEDN and ldap server already configured for the given ldap client machine. If ldaplist works, then the client is correctly configured to talk to the ldap server (whereas ldapsearch working just means the ldap server is reachable). It is probably tuned to work with the Sun One Directory Server, but many specific queries will work with RedHat/Fedora DS also.
    Also, the command works better when ldapclient init is used, rather than hacking the machine to work by copying files to /var/ldap.
    
    ldaplist    passwd "*"  # list all user, equiv of ypcat passwd
    ldaplist -l passwd tin  # display detailed info about user tin
    ldaplist -l group  \*  # list all groups and their members
    
    ldaplist    auto_master  \* # list master automount info, like ypcat -k auto.master
    ldaplist -l auto_nfshome tin # give specific details for /nfshome/tin
    
    ldaplist -l aliases root # find out email alias definition for user root
    
    
    


    AIX

    AIX as LDAP client
    • 5.1: Only supports the AIX LDAP schema. To bind to OpenLDAP or Fedora DS, need to use the PADL nss_ldap module.
    • 5.2: Provides a mapping function and thus can use standard RFC2307 with any ldap server. But it can only bind using cn=Directory Manager, and the password is stored as clear text in the ldap.cfg file (by default only readable by root).
    • 5.3 Base: same support as 5.2
    • 5.3 w/ ML3: mksecldap -a ldap_auth allows binding using a proxyagent. The password is encrypted in ldap.cfg
    • 5.3 w/ ML4: provides an updated automount (nfs.client.rte 5.3.0.40), allowing it to query maps from the ldap server directly.

    Config and Test Commands
    mksecldap -c -h ldap03.hybridauto.com -a "cn=Directory Manager" -p bigsecret -d "dc=hybridauto,dc=com" -u NONE
    ## Bind as Directory Manager, kinda bad.  
    ## On some older systems the password is in clear text in the ldap.cfg file!!
    
    mksecldap -c -h ldap03.hybridauto.com -a "cn=proxyagent,ou=profile,dc=hybridauto,dc=com" -p secret -A ldap_auth -d "dc=hybridauto,dc=com" -u NONE
    ## Works for AIX 5.3 with ML 3 patches, bind for authorization only, using
    ## proxyagent  (which is just a normal People OU in the profile OU).
    ## One can edit the ldap.cfg file and remove the user and password for
    ## anonymous bind.
    
    
    
    
    lsuser -R LDAP tin # see if user "tin" is defined in LDAP
       # AIX command, in /usr/sbin
    
    ls-secldapclntd  # check status of ldap connectivity, in /usr/sbin
    stop-secldapclntd
    start-secldapclntd
    restart-secldapclntd
    flush-secldapclntd
    
    
    Config Files
  • /etc/security/user
  • /etc/security/ldap/ldap.cfg
  • /etc/irs.conf
    1. Update needed if using PADL nss_ldap module.
    2. No changes needed in AIX 5.3, and there should be no clause for automount if it is to query LDAP for the maps (stating that it should query from LDAP makes it fail!)
  • /usr/lib/security/methods.cfg. LDAP clause need to be present, added by ldap.client.rte
  • /etc/pam.conf (no changes are typically required)
  • /etc/inittab has extra clause for ldap:
    ldapclntd:2:once: /usr/sbin/secldapclntd > /dev/console 2>&1
    Any service that depends on ldap user auth needs to be placed below this line (eg Clear Case, custom rc scripts that start services using LDAP user credentials)
  • /etc/profile, /etc/csh.cshrc:
    $AUTHSTATE (env variable: LDAP, compat, files. It defines the default, first method for UID resolution)


  • /etc/security/user config details
    the "default" clause should say something like:
    default:
     SYSTEM = "LDAP or compat or DCE"
     (...)
    
    This would allow local user accounts to be checked. If the order is "compat or LDAP", an ldap user who telnets in will see a small error message about "invalid login name or password", then move on to LDAP and be logged right in (assuming a correct pam.conf). If the order is "LDAP or compat", then somehow local users can still log in even if there is a matching USERNAME in LDAP with /bin/false for the shell. IBM doc says DCE is used for X windows login, but it seems to work w/o it anyway.
    Lastly, if somehow a local user still doesn't work, then a specific clause for that user needs to be added, one that looks similar to the entry for the root user. Either run commands like:
    chuser SYSTEM=compat db2inst8
    chuser SYSTEM=compat registry=files db2inst8
    (registry deals with where the password is stored; for LDAP it would be "registry=LDAP"). Alternatively, one can edit the file manually. The final config should look like:
    db2inst8:
     SYSTEM = "compat"
     (...)
    


    IBM claims that the extension, RFC2307AIX, provides additional support for AIX 5.2 onward. It was supposed to be a transparent addition that does not affect other clients that do not use the feature. The ldif file for the schema update is provided on AIX machines at /etc/security/ldap/nisSchema.ldif . This extra support is not required, but provides AIX with additional user tracking support.

    Linux

    authconfig
    or
    /etc/ldap.conf
    /etc/nsswitch.conf
    
    anonymous bind works if the server allows it; proxyagent bind will need the password 
    in a separate file, mode 600 root-owned, containing the password in clear text.
    
    
    automount: 
    /etc/sysconfig/autofs: define BASEDN so that the correct auto*master map is located. 
    autofs rpm version at least 4.1.3-174 needs to be available to support map retrieval thru LDAP.
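    A sketch of the relevant lines (variable names follow the autofs rpm's bundled template and are assumptions; verify against your rpm version):
    # /etc/sysconfig/autofs
    LDAP_URI="ldap://ldapsvr"
    SEARCH_BASE="ou=sc,ou=ca,ou=na,dc=hybridauto,dc=com"
    MAP_OBJECT_CLASS="automountMap"
    ENTRY_OBJECT_CLASS="automount"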
    
    


    HPUX

    Config files:
    
    /etc/opt/ldapux/ldapclientd.conf # LDAP-UX daemon config file
    /etc/nsswitch.conf
    /etc/pam.conf   # can use pam.ldap as template
    
    Config commands:
    
    swlist -l product | grep -i ldapux
    # Ensure that ldapux package is installed.  
    # Need at least version B.03.30...
    
    cd /opt/ldapux/config
    ./setup
    # this is the main config for the ldap-ux module
    # to configure HP-UX to use ldap for user authentication
    # it is an interactive program
    # will ask ldap server name, port, and 
    # the hpux/ldapux profile dn path/location
    
    
    cd /opt/ldapux/config
    ./ldap_proxy_config -i
    cn=proxyagent,ou=profile,dc=unixville,dc=com
    proxyagentpass
    # configure ldapux to use a proxyagent
    # the two lines after the command are entered after the command is issued
    # there is no prompt, just enter "username" and password, 
    # one line at a time, and then the command prompt will return
    
    
    ./ldap_proxy_config -p
    # print out the config setup above
    
    ./ldap_proxy_config -v
    # verify proxy agent config, should return "verified - valid"
    
    Automount:
    
    swlist -l product | grep -i auto
    # Ensure Enhanced Autofs is installed
    # Need at least version ...
    
    
    
    


    Generic

    These are commands available in most OS, unless otherwise specified.
    
    id -a tin  # see id of user (all platform, but diff flags)
    
    getent    # get entries from the administrative database as defined by nsswitch.conf
         # available on solaris, linux, hp-ux.
              # No service to look up automount maps :(
    getent passwd tin     # see if user tin is recognized
       # similar to ypcat passwd | grep tin
       # but would work against LDAP source.
    getent hosts  # get list of hosts , but don't retrieve from DNS
            
    
    
    perl -e "print crypt('clear-text-password','salt');"
       # generate the CRYPT encrypted version of the string
       # clear-text-password, using the first two letters as
       # the salt to seed the encryption.
       # CRYPT is the default password encrypting scheme
       # for solaris /etc/shadow and many other unices.
       # {CRYPT}encrypted entry can be used in ldif file
       # for password import into LDAP POSIX User account.
    

    Reference

    Online

    • PADL (LDAP spelled backward) is a company that provides LDAP modules for many OSes, as well as conversion scripts, NIS Gateway, etc.
    • SymLabs LDAP online training tutorial
    • Online article about LDAP and how companies will eventually have to adopt it.

    Red Hat


    IBM AIX


    Books

    • Sun Geeks Guide to Native LDAP A Native LDAP Blueprint. Jim Covington. ISBN 1419630288. (c) 2006.
      This is the only book that I have found which covers native OS config to hook it up to LDAP for basic user authentication. It covers Solaris, Red Hat Linux, AIX and HP-UX. It is centered on the Sun Java System Directory Server, but that is largely the same as the Netscape/RedHat DS, so most tasks should carry over. For the server config, it provides steps for converting NIS data, SSL, certificates, replication, etc.
      If you are a sysadmin having to convert your heterogeneous environment from NIS to LDAP, this book would be a necessity. But be warned, this is NOT a comprehensive guide. The book is very slim, 124 pages in all, with 41 figures and additional text-based output covering most of the pages. The author does take the admin step-by-step through configuring the clients. However, there is little explanation of what the different commands do, or of how to troubleshoot a problem, which invariably will happen.
      This book would serve as a good reference for those admins who have already done the setup but need a quick guide to remind them of the detailed steps involved in each OS. For those doing it the first time, the book would serve as a guide, but more info would need to be dug out, and there are no links or pointers from this book.
      Lastly, the book seems to have started as web-based documentation that later got printed. This isn't bad, but the table of contents has page numbering that is not quite accurate. Furthermore, the many screenshots in the book would be more meaningful if additional text explained what was clicked to get to the view presented.

    • LDAP System Administration. Gerald Carter. O'Reilly. ISBN 1565924916. (c) 2003.
      This book covers mostly the open source LDAP server, which lacks multi-master replication, rendering it of limited use in enterprises avoiding SPOF (but it does support simple replication). It starts with a concise technical overview of LDAPv3, then moves on to describe how to create a simple OpenLDAP server to support a simple Directory. It does cover some specifics of client config regarding pam_ldap, nss_ldap, OpenSSH, FTP, HTTP and email/MTA. However, it lacks details on how to configure the very many OSes when the admin is seeking to replace NIS, which works very well with the native telnet, rlogin, CDE, etc.

    • Understanding and Deploying LDAP Directory Services, 2nd Ed. Howes, Smith, Good. ISBN 0672323168. (c) 2003.
      This book gives a comprehensive review of what LDAP is, what it can do, and examples of deployment, centered mainly on the original Netscape DS. Other than the specifics of the LDAP schema and LDIF fragments, a review of the Netscape DS in Ch 4, and a good working overview of replication in Ch 22, this book is largely high-level business talk. It doesn't have much detail about client connectivity, or the specifics of how to configure and maintain the Netscape DS. The high-level text saying you need to do backups, have rigorous testing and a go-live plan, etc, is of little relevance to the sysadmin working on a committed project. It concludes with a discussion of Perl Net::LDAP, which would be very useful for scripting projects after LDAP is established.

    • LDAP Directories Explained: An Introduction and Analysis. Brian Arkills. ISBN 020178792X. (c) 2003.
      Again, a high-level book with little technical detail. A good starter guide as to what LDAP provides. There are some rather theoretical details in the appendices, usable as a quick reference for an LDAP architect, but of limited use for the sysadmin.

    • Sun LDAP Blueprint.
      This book covers only Sun's product. The Sun DS discussion is probably applicable to the DS branded by other vendors, but the client OS config is solely Solaris, and it has a lot of intricacies of its own.


    [Doc URL: http://www.grumpyxmas.com/ldap.html]
    [Doc URL: http://www.cs.fiu.edu/~tho01/psg/ldap.html]
    (cc) Tin Ho. See main page for copyright info.


    "LYS on the outside, LKS in the inside"
    "AUHAUH on the outside, LAPPLAPP in the inside"
    Saturday, November 10, 2012

    netapp


    NetApp

    NetApp 101

    https://netapp.myco.com/na_admin  # web gui URL.  Most features avail there, including a console.
    
    ssh netapp.myco.com   # ssh (or rsh, telnet in) for CLI access
    
    get a root mount of /vol/vol0/etc on a unix machine to do direct config on the files.
    
    
    
    NOW = NetApp Support Site
    NetApp man pages ("mirror" by uwaterloo)
    RAID-DP

    IMHO Admin Notes

    Notes about NetApp export, NFS and Windows CIFS ACL permission issues.
    
    Best practice is for most (if not all) export points of an NFS server to  
    implement root_squash: root on
    the nfs client is translated to user 'nobody' and effectively has
    the lowest access permission.  This is done to reduce accidents of users
    wiping out the whole NFS server content from their desktops.
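    On the filer, that corresponds to /etc/exports lines without a root= clause
    (a sketch; hostnames and paths are made up):
    /vol/vol1/data  -sec=sys,rw                   # no root= : root squashed to nobody
    /vol/vol1/home  -sec=sys,rw,root=adminhost    # adminhost keeps root access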
    
    Sometimes NetApp NFS exports actually sit on top of a filesystem using windows NT ACLs;
    the file permissions may show up as 777, but when it comes to accessing
    a file, authentication from the Windows server (PDC/BDC 
    or AD) is required.  Any user login name that does not have a match in the
    windows user DB will have permission denied problems.  
    
    Most unix clients with automount can access the nfs server thru /net.
    However, admins should discourage heavy reliance on /net; it is fine
    for occasional use.
    /home/SHARE_NAME or other mount points should be 
    provided, such as /corp-eng and /corp-it.  This keeps the mount paths 
    more controllable, and also avoids an older AIX bug where volumes reached 
    via /net are accessed as the user instead of root, which 
    gets most privileges squashed away.
    If the FS is accessible by both Windows and Unix, it is best to make share names
    simple and keep them consistent.  Some admins like to create
    matching 
    \\net-app-svr1\share1   /net-app-svr1/share1
    \\net-app-svr2\share2   /net-app-svr2/share2
    I would recommend that on the unix side /net-app-svr1 be unified into a
    single automount map called something like /project .  This would mean 
    all share names need to be unique across all servers, but it helps keep
    the transparency that allows for server migration w/o affecting users' work
    behaviour.  A sketch follows.
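    A sketch of what that unified indirect map might look like (names invented
    for illustration):
    # auto.master
    /project   auto_project
    
    # auto_project
    share1  -rw  net-app-svr1:/vol/vol1/share1
    share2  -rw  net-app-svr2:/vol/vol2/share2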
    
    
    Old Filer to New Filer Migration problems:
    
    If copying files from a Unix FS to a Windows-style FS, there are likely going to 
    be pitfalls. NDMP would copy the files, and permissions and dates would be 
    preserved, but ownership of the files may not be.  XCOPY from 
    DOS (or robocopy) may work a tad better in the sense that the files 
    go thru the normal windows checking of access and ownership at
    creation. Clear Case needed to run chown on the files that correspond to 
    the view, and not having the ownership preserved became a big problem.  
    Ultimately, the user that ran the CC script for ownership changes was made part of 
    the NetApp Local Admin Group.  A more refined ACL would be safer.
    
    Filer data migration:
    
    NDMP is the quickest.  One can even turn off NFS and CIFS access to ensure
    no one is writing to the server anymore; NDMP is a different protocol
    with its own access mechanism.
    
    
    Mixed NFS and CIFS security mode:
    
    Mixed-mode security (NT and Unix) is typically a real pain in the rear.
    Migrating from NT and/or Unix to mixed mode means the filer has to fabricate
    permissions, which may have unintended side effects.
    Switching from mixed mode to either NT or Unix just drops the extra permission
    info, thus some consultants say that is the safer step.
    
    ClearCase and NetApp each point at the other as recommending mixed-mode
    security.  It may be a nightmare if really used.  Unix mode worked flawlessly for
    3+ years.
    
    Different NetApp support/consultants say different things about mixed mode,
    but my own experience matches this description:
    Mixed mode means the filer stores either a Unix or an NTFS ACL on a file-by-file basis.
    If a given file's (or dir's) ACL was last set from unix, it has only a Unix ACL on it;
    if last set from NTFS, it gets a Windows ACL.
    Both are never stored at once: only one of the two is stored, and the other view is
    resolved in real time by the filer.
    This has a nasty side effect: after flipping the security style from mixed mode to, say, NTFS,
    some files' permissions are left alone, and even a windows admin can't change/erase those files,
    because the admin is not seen as root.
    In short, avoid mixed mode like the plague!!
    
    
    

    LVM

    
    Layers:
    
    Qtree, and/or subdirectories, export-able
      |
    Volume (TradVol, FlexVol), export-able, snapshot configured at this level.
      |
    aggregate (OnTap 7.0 and up)
      |
     plex      (relevant mostly in mirror conf)
      |
    raid group
      |
    disk
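    
    A sketch of how the layers map to commands on OnTap 7.x (names, sizes and subnet are examples):
    
    aggr create aggr1 16                  # aggregate from 16 disks (raid groups built automatically)
    vol create vol1 aggr1 500g            # FlexVol carved out of the aggregate
    qtree create /vol/vol1/qtree1         # qtree inside the volume
    exportfs -io rw=10.46.8.0/24 /vol/vol1/qtree1  # export it (also add to etc/exports to persist)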
    
    
    Disks - physical hardware device :)
       Spares are global, auto-replacing a failed disk in any raid group.
       The system will pick a spare of the correct size.
       If no hot spare is available, the filer runs in degraded mode when a disk
       fails, and by default shuts down after 24 hours!  (options raid.timeout, in hours)
    
       sysconfig -d  # display all disks and some sort of id
       sysconfig -r  # info about usable and physical disk sizes,
          # as well as which raid group each disk belongs to
    
     disk zero spares  # zero all spare disks so they can be added quickly to a volume
     vol status -s     # check whether spare disks are zeroed
    
       web gui: Filer, Status 
        = display number of spares avail on system
    
       web gui: Storage, Disk, Manage 
        = list of all disks: size, parity/data/spare/partner info,
        and which vol each disk is used for.
        (raid group info is omitted)
    
       Disk naming:
       2a.17  SCSI adaptor 2, disk scsi id 17
       3b.97  SCSI adaptor 3, disk scsi id 97
    
       a = the main channel, typically for the filer's normal use
       b = secondary channel, typically hooked to the partner's disks, for takeover use only.
    
    
    
    Raid group - a grouping of disks.
      Should really have a hot spare, or else the filer runs in degraded mode when
      a disk fails, and shuts down in 24 hours by default (so it can't tolerate a weekend failure).
    
            max raid group size:
                  raid4    raid-dp   (def/max)
     FC            8/14     16/28 
     SATA, R200    7/7      14/16
    
     Some models differ slightly from the above.
    
    
     Raid-DP?
     2 parity disks per raid group instead of 1 in raid4.
     If you are going to have a large volume/aggregate that spans 2 raid groups (in
     a single plex), then may as well use raid-dp.
     A larger raid group size saves storage by needing fewer parity disks,
     at the expense of slightly less data safety in case of multi-disk failure.
     
    
    Plex
     - a mirrored volume/aggregate has two plexes, one for each complete copy of the
       data.
     - raid4/raid_dp has only one plex; raid groups are "serialized".
    
    
    aggregate - OnTap 7.0 addition, a layer b/w volume and disk.  With this, NA
     recommends creating one huge aggregate that spans all disks of the
     same RAID level, then carving out as many volumes as desired.
    
    
    Volume - traditional mgmt unit, called an "independent file system".
         aka Traditional Volume, starting in OnTap 7.0.
         Made up of one or more raid groups.
         - disk(s) can be added to a volume; by default they join an existing raid
           group in the vol, but if that is maxed out, a new raid group is created.
         - vol size can be expanded, but no shrink, concat or split.
         - a vol can be exported to another filer (foreign vol).
         - a small vol implies a small raid group, therefore wastes more space (parity overhead).
         - max size = 250 GB recommended max vol size in 6.0
    
         vol status -v [vol0] # display status of all [or specific] volume,
               # -v gives all details on volume options
         vol lang    vol0         # display [set] character set of a volume
    
         vol status -r  # display volume and raid status
         sysconfig  -r   # same as vol status -r
    
         vol create newvol  14 # create a new vol w/ 14 disks
         vol create newvol2 -t raid4 -r 14 6@136G
      # vol size is 6 disks of 136 GB
      # use raid4 (alternatively, raid_dp)
      # use raid groups of 14 disks (def in cli);
      # each raid group needs a parity disk, so a
      # larger raid group saves space (at the expense of safety)
      # 28 disks usable in raid_dp only?
    
    
         vol add newvol2 3  # add 3 more disks to a volume
         vol options vol1 nosnap on # turn off snapshot on a vol
         vol offline vol2
         vol online  vol2
    
    FlexVol - OnTap 7.0 and up; resembles a TradVol, but built on top of an aggregate.
        - grows and shrinks as needed
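    
    Sketch of growing/shrinking a FlexVol with vol size (OnTap 7.x; names and sizes are examples):
    
    vol create proj1 aggr1 100g   # FlexVol carved from aggregate aggr1
    vol size   proj1 +50g         # grow by 50 GB
    vol size   proj1 -20g         # shrink by 20 GB (FlexVol only; TradVol cannot shrink)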
    
    
    QTree  - "Quota Tree": stores security style config, oplocks setting, disk space usage and file limits.
          Multiple qtrees per volume.  QTrees are not required; NA can have simple/plain
          subdirs at the "root level" of a vol, but such dirs cannot be converted to qtrees.
          Any files/dirs not explicitly under a qtree are placed in a
          default/system QTree 0.
    
        qtree create   /vol/vol1/qtree1  # create a qtree under vol1
        qtree security /vol/vol1/qtree1 unix # set unix security mode for the qtree
                 # could also be ntfs or mixed
        qtree oplocks  /vol/vol1/qtree1 enable # enable oplocks (windows clients can perform caching)
    
    
    Config Approach
    Aggregate:
    Create the largest aggregate possible; 1 per filer head is fine, unless traditional vols are needed.
    
    Create as many FlexVols as desired, since a FlexVol can grow and shrink as needed.
    Max vols per aggregate = 100 ??
    
    TradVol vs QTree?
    - use fewer traditional volumes when possible, since each volume has parity disk overhead
      and space fragmentation problems.
    - use QTree as the size-management unit.
    
    
    FlexVol vs QTree?
    - Use a volume per "conceptual management unit".
    - Use diff vols to separate production data vs test data.
    - QTrees should still be created under the volume instead of simple plain subdirectories
      at the "root" of the volume.
      This way, quota can be turned on, if just to monitor space usage.
    - One FlexVol per project is good.  Start the vol small and expand as needed.
      Shrink as it dies off.
    - Use QTrees for different pieces of the same project.
    - Depending on the backup approach, smaller volumes may make backup easier.
      Should try to limit volumes to 3 TB or less.
    
    
    

    Quotas

    mount the root dir of the netapp volume on a unix or windows machine.
    vi (/) etc/quotas   (in dos, use edit, not notepad!!)
    then telnet to the netapp server and issue the command  quota resize vol1 .
    
    quota on  vol1
    quota off vol0
    quota report
    quota resize vol1 # update/re-read quotas (per-vol)
      # for user quota creation, may need to turn quota off,on for the volume
      # for changes to be parsed correctly.
    
    Netapp quota supports hard limit, threshold, and soft limit.
    However, only the hard limit returns an error to the FS.  The rest are largely
    useless; the quota command on linux clients is not functional against netapp :(
    
    
    Best Practices:
    
    Other than user home directory, probably don't want to enforce quota limits.
    However, still good to turn on quota so that space utilization can be monitored.
    
    
    
    /etc/quotas
    ##                                          hard limit | thres | soft limit
    ## Quota Target      type                   disk  files|  hold | disk  file
    ## -------------     -----                  ----  -----  ------ ----- -----
    
    *                    tree@/vol/vol0            -     -      -     -     - # monitor usage on all qtrees in vol0
    *                    tree@/vol/vol1            -     -      -     -     -
    *                    tree@/vol/vol2            -     -      -     -     -
    
    /vol/vol2/qtree1     tree             200111000k   75K      -     -     - # enforce qtree quota; kb is easier to compare on report
    /vol/vol2/qtree2     tree                      -     -  1000M     -     - # enable threshold notification for qtree (useless)
    
    *                    user@/vol/vol2            -     -      -     -     - # usage based on file ownership, w/in specified volume
    tinh                 user              50777000k     -     5M    7M     - # user quota, on ALL fs ?!  may want to avoid
    tinh                 user@/vol/vol2          10M     -     5M    7M     - # enforce user's quota w/in a specified volume
    tinh                 user@/vol/vol2/qtree1  100M     -      -     -     - # enforce user's quota w/in a specified qtree
                # exceptions for +/- space can be specified for a given user/location
    
    
    # 200111000k = 200 GB
    #  50777000k =  50 GB
    # they make output of quota report a bit easier to read
    
    # * = default user/group/qtree 
    # - = placeholder, no limit enforced, just enable stats collection
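    
    After editing etc/quotas, a typical activation sequence would be (sketch, per the notes above):
    
    quota resize vol2   # re-read changed limits for existing quota targets
    quota off vol2
    quota on  vol2      # full re-initialization; needed when targets are added/removed
    quota report        # verify the new limits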
    

    Snapshot

    Snapshots are configured at the volume level. Thus, if different data need different snapshot characteristics, they should be in different volumes rather than just in different QTrees.
    WAFL automatically reserves 20% of a volume for snapshot use.
    snap list vol1
    snap create vol1 snapname # manual snapshot creation
    snap sched   # print snapshot schedules for all volumes
    snap sched vol1 2 4   # schedule for vol1: keep 2 weekly, 4 daily, 0 hourly snapshots
    snap sched vol1 2 4 6  # same as above, but keep 6 hourly snapshots
    snap sched vol1 2 4 6@9,16,20 # same as above, with the hourly snapshots taken at 9:00, 16:00 and 20:00
    snap reserve vol1     # display the percentage of space reserved for snapshots (def=20%)
    snap reserve vol1 30  # set 30% of volume space for snapshots
    
    vol options vol1 nosnap on # turn off snapshots; note it applies to the whole volume!
    
    gotchas, as per netapp:
    "There is no way to tell how much space will be freed by deleting a particular snapshot or group of snapshots."
    
    

    DeDup A/SIS

    Advanced Single Instance Storage (ie DeDuplication).
    DeDuplication finds duplicate data and collapses it into a single unit. NetApp A/SIS works at the block level (4KB) and operates in the background on individual FlexVols (not usable on Traditional Volumes). Like snapshots that have inodes pointing to the same block, SIS uses the same tech to reduce storage need: "same" blocks are indexed by hash, and "sameness" is verified via a byte-by-byte comparison before the inode pointers are re-organized to free space.

    Performance impact:
  • File reads just traverse a series of blocks in the i-node map. Random reads are the same. Sequential reads may no longer be sequential, but a large number of client requests hardly makes reads truly sequential anyway.
    Unlike the EMC NS-series (as of Celerra v5.6), NetApp's dedup is not bundled with compression, so there is no "re-hydration" (de-compression) delay when accessing files.
  • Write operations take a real-time impact once SIS is turned on (and started): every write generates a fingerprint on the fly, and the info is written to the change log. This calculation takes cpu power. It won't be impactful on a system with less than 50% load, but a busy system can see degradation of around 15%, up to 35% on high-end systems, and these numbers are for FC disks.
    Page 6 of TR-3505:
    In real time, as additional data is written to the deduplicated volume, a fingerprint is created for each new block and written to a change log file. When deduplication is run subsequently, the change log is sorted and its sorted fingerprints are merged with those in the fingerprint file, and then the deduplication processing occurs.
    Note that there are really two change log files, so that as deduplication is running and merging the new blocks from one change log file into the fingerprint file, new data that is being written to the flexible volume is causing fingerprints for these new blocks to be written to the second change log file. The roles of the two files are then reversed the next time that deduplication is run.
    Page 15 of TR-3505:
    If the load on a system is low—that is, for systems in which the CPU utilization is around 50% or lower—there is a negligible difference in performance when writing data to a deduplicated volume, and there is no noticeable impact on other applications running on the system. On heavily used systems, however, where the system is nearly saturated with the amount of load on it, the impact on write performance can be expected to be around 15% for most NetApp systems. The performance impact is more noticeable on higher-end systems than on lower-end systems. On the FAS6080 system, this performance impact can be as much as 35%. The higher degradation is usually experienced in association with random writes. Note that these numbers are for FC drives; if ATA drives are used in a system, the performance impact would be greater.
  • The real dedup workload (finding duplicate blocks) can be scheduled to run at night, or run on demand when the sa knows the filer is not busy.

    SIS won't operate on blocks locked by a snapshot, so savings may be low when sis is first turned on, till old snapshots expire. It is recommended to run sis before taking snapshots.
    
    sis on /vol/unixhome
    sis start -s /vol/unixhome # run scan for the first time (generate fingerprint)
    sis status   # show status and progress of scan if running
    df -s    # report on saving by dedup
    sis config    # see when sis is scheduled to run
    sis config -s auto /vol/home # use "auto" to rescan when the amount of change is high
        # recommended on all volumes, to reduce concurrent scans at mid-nite.
    
    sis off /vol/unixhome  # disable dedup: stops fingerprints from being generated and written to the change log
        # presumably with just this, the write perf degradation stops.
    sis undo /vol/unixhome  # re-duplicate deduped blocks, delete the fingerprint db when done.
        # use "priv set diag" to enter diag mode to run "undo".
    
    
    On a really busy FS that has idle cycles once in a while, dedup can perhaps save space with no perf degradation:
    - sis on FlexVol
    - sis start -s FlexVol
    - sis off
    - (work)
    - sis start ...  (when system is idle)
    - sis off  (once scan is complete and busy working for user req again)
    
    Ref: TR-3505: NetApp Deduplication for FAS and V-Series Deployment and Implementation Guide

    NFS

    
    (/) etc/exports
    is the file defining what is exported, and who can mount the root fs as root.  Unix NFS related only.
    
    /vol/vol0        -access=sco-i:10.215.55.220,root=sco-i:10.215.55.220
    /vol/vol0/50gig  -access=alaska:siberia,root=alaska
    
    Unlike most Unices, NetApp allows exporting both ancestors and descendants.
    
    other options:
    -sec=sys # unix security, ie use uid/gid to define access
      # other options are kerberos-based.
    
    Besides exports for nfs and shares for cifs,
    there is another setting for the fs security permission style: unix, ntfs, or mixed.
    This controls the characteristics of chmod and file ACLs.
    
    Once the edit is done, telnet to the netapp and issue the cmd:
    exportfs -a  # re-add all exports as per new etc/export file
    exportfs -u  # unexport everything.  Careful!
    exportfs -u vol/vol1 # unexport vol1 (everything else remains intact)
    exportfs -r  # remove all exports that are no longer listed in etc/exports, maintain those that are still listed
       # -r is NOT the same as -au!
    
    The hostname resolution bug that Solaris and Linux NFS have seems to exist on NetApp also.
    Hosts listed in exports sometimes need to be given by IP address, or an
    explicit entry in the hosts file needs to be set up.  Somehow, sometimes
    the hostname does not get resolved thru DNS :(
    maybe it is a dns-cache poisoning problem...
    
    
    options nfs.per_client_stats.enable on
     # enable the collection of detailed nfs stats per client
    options nfs.v3.enable  on
    options nfs.tcp.enable on
     # enable NFS v3 and TCP for better performance.
    
    nfsstat  # display nfs statistics, separate v2 and v3
    nfsstat -z  # zero the nfsstat counters
    nfsstat -h  # show detailed nfs statistics, several lines per client, since zeroed
    nfsstat -l  # show 1-line stat per client, since boot (non-resettable stat)
    
    

    NIS domain

    changing the NIS domain; no reboot should be necessary
    
    options nis.enable   off
    options nis.domain   new.nis.dom
    options nis.servers  10.0.91.44,10.0.91.82
    options nis.enable   on
    
    

    CIFS

    
    cifs disable  # turn off CIFS service
    cifs enable
    cifs setup  # configure domainname, wins.  only works when cifs is off.
    cifs testdc  # check registration w/ Windows Domain Controller
    
    
    
    cifs shares     # display info about all shares
    cifs shares -add sharename path -comment desc # create a new share and give it some descriptive info
    cifs shares -change shrname -forcegroup grpname # make all cifs users use a forced unix group on a Unix-style FS.
          # this applies to both read and write, so the mapped unix user need not be
          # a member of this forcegroup in the passwd or group map/file.
          # the groupname is a string, not a gid number; the name needs to be resolvable
          # from NIS, LDAP, or the local group file.
    cifs shares -change shrname -umask 002     # define the umask to be used.
    
    cifs access -delete wingrow  Everyone    
     # by default, share is accessible to "everyone" (who is connected to the domain)
     # the above deletes this default access
     # Note that this is equiv to exports, not file level ACL
    cifs access wingrow "authenticated users"  "Full Control" 
     # make share usable by authenticated users only
    cifs access it$ AD\Administrator "Full Control"    
     # make share "hidden" and only give access to admin  
     # (not sure if can use group "administrators")
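    
    A quick check from a Windows client after granting access (sketch; server/share names are examples):
    
    net use Z: \\netapp\wingrow   # map the share as drive Z:
    net use Z: /delete            # unmap when done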
    
    
    
    cifs sessions ...    # list current active cifs connections
    
    options cifs.wins_servers
     # list which WINS servers the machine is using
    
    ifconfig  wins # enable  WINS on the given interface
    ifconfig  -wins # disable WINS on the given interface
    
    # WINS registration only happens when "cifs enable" is run.
    # re-registration means stopping and starting cifs service.
    # enabling or disabling wins on an interface will NOT cause re-registration
    
    etc/lclgroups.cfg  # lists local groups and their membership SIDs
       # don't edit manually; use windows tools to update it!
    
    
    
    wcc  WAFL credential cache control, often used to check windows-to-unix mapping
     -u uid/uname   uname may be a UNIX account name or a numeric UID
     -s sid/ntname  ntname may be an NT account name or a numeric SID
    
     A SID has a long string for the domain, then the last 4-5 digits are the user.
     All computers in the same domain use the domain SID.
    
     -x remove entries from the WAFL cache
     -a add entries
     -d display stats
    
    
    options wafl.default_nt_user username 
     # set what nt user will be mapped to unix by def (blank)
    options wafl.default_unix_user username
     # set what unix username will be used when mapped to NT (def = pcuser)
    
    
    User mapping b/w nt and unix, where the user names are not the same,
    is stored in the (/) etc/usermap.cfg file.
    
    Format:   NT_account   unix_account
    Optionally, use <= and => for single-direction mapping instead of the default two-way.
    eg:
    eg:
    
    tileg\Administrator      root
    tileg\fgutierrez         frankg
    tileg\vmaddipati         venkat
    tileg\thand              thand2
    tileg\thand              thand1
    tileg\kbhagavath         krishnan
    
    *\eric   => allen
    ad\administrator <= sunbox:root
    nt4dom\pcuser  <= tinh
    
    This mapping is done so that users gain full permission on their files under both environments.
    A lot of the time, users got their nt account first, and thus end up with read-only access
    to their home dir from windows, as they are mapped as a non-owner.
    
    usermap.cfg does get read when a windows user writes to a unix-style FS.
    Be careful when doing M-1 mapping.  While this may allow many unix users to use the same NT account
    to gain access to an NT-style FS as part of "everyone", the reverse access is problematic.
    eg:
    hybridautoAD\tho sa
    hybridautoAD\tho tho
    While unix sa and tho map to the same user on windows, when Windows tho logs in and tries to write
    to a UNIX-style FS, the permissions will assume those of unix user sa, not tho!!
    
    It may be possible to use <== and ==> to indicate the direction of mapping ??
    
    
    (??) another map does the reverse, mapping windows back to NFS when the fs is NFS and access is from windows
    (or was it the same file?).  It was pretty stupid in that it needed all users to be explicitly mapped.
    
    
    The NetApp web interface controls share access (akin to exports).
    Windows Explorer (the file manager) controls each file's ACL (akin to chmod on files).
    
    Windows management tools can be used to manage the NetApp; general users can connect and browse.
    The user list may not work too well.
    
    
    
    

    CIFS Commands

    
    cifs setup   # configure CIFS, requires the CIFS service to be restarted
        # - register the computer with the windows domain controller
        # - define WINS server
        
    options cifs.wins_servers # display which WINS servers the machine is using
        # prior to OnTap 7.0.1, this is read only
    
    cifs domaininfo   # see DC info
    cifs testdc   # query DC to see if they are okay
    cifs prefdc print  # (display) which DC is used preferentially
        
    
    WINS info from NetApp, login req: http://now.netapp.com/Knowledgebase/solutionarea.asp?id=3.0.4321463.2683610
    # etc/cifsconfig_setup.cfg 
    # generated by cifs setup; contains the command used to start up CIFS at boot
    # eg:
    cifs setup -w 192.168.20.2  -w 192.168.30.2 -security unix  -cp 437
    
    # usermap.cfg
    
    # one way mapping
    *\lys => lks
    NETAPP\administrator <= unixhost:root
    
    # two way mapping
    WINDOM\tinh tin
    
    ## these below are usually default, but sometime need to be explicitly set
    ## by some old NT DC config.
    WINDOM\* == * # map all users of a specific domain
    # *\*    == *   # map all users in all domains  
    

    Command

    Commands for NetApp CLI (logged in thru telnet/ssh/rsh)
    
    ? = help, cmd list
    help cmd 
    
    
    dns info  # display DNS domain, 
       # extracted from WINDOWS if not defined in resolv.conf
    options dns.domainname  # some /etc/rc scripts set the domain here
    
    sysconfig -v 
    sysconfig -a # display netapp hw system info, including serial number and product model number
    sysconfig -c # check that there are no hardware misconfig problems; auto-checked at boot
    
    sysstat 1  # show stats on the server, refresh every 1 sec.
    
    df -h
     similar to unix df, -h for "human readable"
     .snapshot should be a subset of the actual volume 
    
    df -s report sis/dedup saving on a volume
    
    ndmpd status
     list active sessions
    
    ndmpd killall
     terminate all active ndmpd sessions.
     Needed sometimes when backup software is hung; kill the ndmpd sessions to free it.
    
    useradmin useradd UID
     add new user (to telnet in for admin work)
    
    useradmin userlist
     list all users
    
    
    options   # list run time options.
    options KEY VALUE  # set specific options
    
    #eg, autosupport with email:
    options autosupport.mailhost  mailhost.myco.com,mailhost2.myco.com
     # comma-separated list of up to 5 hosts (tried till one works?)
    options autosupport.support.transport smtp
    options autosupport.support.to autosupport@netapp.com
    options autosupport.to tin.ho@e-ville.com,bofh@e-ville.com
     # Change who receives notification emails.
    options autosupport.doit case_number_or_name
     # Generate an autosupport email to NetApp (to predefined users).
    
    # autosupport via web  (but then local admins don't get email?)
    options autosupport.support.transport https
    options autosupport.support.proxy     na-useh-proxy:2010
    
    
    # find out about ntp config (eg from a unix box with the filer's etc dir mounted):
    cat registry | grep timed
    options.cf.timed.max_skew=
    options.service.cf.timed.enable=off
    options.service.timed.enable=on
    options.timed.log=off
    options.timed.max_skew=30m
    options.timed.min_skew=10
    options.timed.proto=ntp
    options.timed.sched=hourly
    options.timed.servers=time-server-name   # time server to use
    options.timed.window=0s
    state.timed.cycles_per_msec=2384372
    state.timed.extra_microseconds=-54
    state.timed.version=1
    
    rdfile   # read a file raw, equiv to unix cat
      eg: rdfile /etc/exports
      in a telnet session, prints the root etc/exports file to stdout.
    
    wrfile   # write stdin to a file
      not an editor; more like a `cat - > file` kind of thing.
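    
    Since wrfile takes stdin, a common pattern is to pipe a prepared file in over rsh/ssh
    instead of typing into the terminal; a sketch (beware: wrfile overwrites the target file):
    
    rsh netapp.myco.com wrfile /etc/exports < exports.new
    rsh netapp.myco.com rdfile /etc/exports    # verify the result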
    
    
    FilerView
    FilerView is the web GUI. If the SSL certificate is broken, it may load up a blank page.
    secureadmin status
    secureadmin disable ssl
    secureadmin setup -f ssl # follow prompt to setup new ssl cert
    
    SSH
    To allow root login to netapp w/o password, add root's id_dsa.pub to vol1/etc/sshd/root/.ssh/authorized_keys
    Beware of the security implications!
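    A sketch of setting that up from a unix admin box, assuming the filer root volume is mounted as in the NetApp 101 section (paths are examples):
    
    mkdir -p /mnt/na-root/etc/sshd/root/.ssh
    cat ~/.ssh/id_dsa.pub >> /mnt/na-root/etc/sshd/root/.ssh/authorized_keys
    ssh root@netapp.myco.com version    # should now log in without a password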

    Config Files

    
    All stored in the etc folder:
    resolv.conf 
    nsswitch.conf
    
    # etc/exports
    
    /vol/unix02  -rw=192.168.1.0/24:172.27.1.5:www,root=www
    /vol/unix02/dir1 -rw=10.10.10.0/24
    
    # can export subdirs with separate permissions
    # issue exportfs -a to reread file
    
    

    Logs

    (/) etc/messages.*  unix syslog style logs.  Can be configured to also log to a remote syslog host.
    
    (/) etc/log/auditlog
     logs all filer-level commands, not changes done on the FS.
    
    
    
    The "root" of vol 0,1,etc in the netapp can be choose as the netapp root and store the /etc directory, 
    where all the config files are saved.  eg.
    /mnt/nar_200_vol0/etc
    /mnt/na4_vol1/etc
    
    other command that need to be issued is to be done via telnet/rsh/ssh to the netapp box.
    
    

    Howto

    Create a new vol and qtree, and grant access for CIFS
    vol create win01 ...
    
    qtree create   /vol/win01/wingrow
    qtree security /vol/win01/wingrow ntfs
    qtree oplocks  /vol/win01/wingrow enable
    cifs shares -add wingrow /vol/win01/wingrow -comment "Windows share growing"
    #-cifs access wingrow ad\tinh "Full Control"  # share level control is usually redundant
    cifs access -delete wingrow  Everyone
    cifs access wingrow "authenticated users"  "Full Control"
    
    # still need to go to the folder and set file/folder permissions:
    # add the corresponding department (in MMC: share, permission; type in am\Dept-S
    # then Alt+K to complete the list (ie, checK names)).
    # also remove inherit-from-parent, to take out full access for everyone.
    
    

    Network Interface Config

    vif = virtual interface, eg use: create an etherchannel
    
    link aggregation (netapp typically calls it trunking; cisco: EtherChannel).
    
    single mode = HA failover; only a single link is active at a time.
    multi mode = performance; multiple links active at the same time.  Requires switch support.
        Only good when multiple hosts access the filer; the switch does the
        traffic direction (per host).
    
    Many filers come with 4 built-in ethernet ports, so one can do:
    
    2 pairs of multi mode (e0a+e0b, e0c+e0d),
    then single mode on top of the two pairs to get HA; the filer will always have
    2 links active at the same time (see the sketch below).
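    
    Sketch of that layered setup with the OnTap 7.x vif command (interface/vif names and address are examples):
    
    vif create multi  mvif0 e0a e0b       # multi-mode pair (switch must do EtherChannel)
    vif create multi  mvif1 e0c e0d
    vif create single svif0 mvif0 mvif1   # single mode on top of the 2 pairs => HA
    ifconfig svif0 10.46.8.150 netmask 255.255.255.0 up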
    
    
    pktt start all -d /etc  # packet tracing, like tcpdump
    pktt stop all
    # traces all interfaces, writing one file per interface into the /etc dir;
    # the files can be read by ethereal/wireshark
    
    
    

    Backup and Restore, Disaster Recovery


    NetApp supports dump/restore commands, a la Solaris; the archives created can even be read by the Solaris ufsrestore command.
    NetApp championed NDMP, and it is fast. But it backs up a whole volume as a unit, and restores have to be done as a whole unit too, which may not be convenient.
    volcopy is fast; it will also copy all the snapshots associated with the volume.
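    Sketches of the two filer-side copy methods mentioned above (names are examples; for vol copy the destination volume must already exist and be restricted):
    
    ndmpcopy /vol/vol1/qtree1 rfiler:/vol/vol5/qtree1   # path-level copy over NDMP
    
    vol copy start -S vol1 rfiler:vol9   # whole volume; -S also copies its snapshots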

    DFM

    Data Fabric Manager, now known as ...
    typically https://dfm:443/
    Meant to manage multiple filers in one place, but seems to just collect stats. Kinda slow at times. And to get volume config one still has to use FilerView, so it's not a one-stop thing. ==> limited use.

    Links

    1. RAID_DP

    History

    
    OnTap ?     - ca 2000?  Alpha chip
    OnTap 5.3   - 
    OnTap 6.1   - 2003?  Intel based? 
    OnTap 6.5   - 2004?  RAID_DP (dual parity) introduced here.
    OnTap 7.0   - 2005?  Aggregate introduced here.
    OnTap 7.3.1 - 2008?  DeDuplication (a/sis) single instance storage available. 
    
    


    [Doc URL: http://www.grumpyxmas.com/netapp.html]
    [Doc URL: http://sn50.user.sonic.net/psg/netapp.html]
    [Doc URL: http://www.cs.fiu.edu/~tho01/psg/netapp.html]

    (cc) Tin Ho. See main page for copyright info.
    Last Updated: 2007-04-27, 2009-04-01


    "LYS on the outside, LKS in the inside"
    "AUHAUH on the outside, LAPPLAPP in the inside"
    psg101 sn50 tin6150