## $Id: INSTALL,v 4.5 2000/03/21 05:23:58 vikas Exp $

INSTALLATION INSTRUCTIONS FOR 'NOCOL' v4.3
==========================================

NOTE:	You will have to edit & customize some of the PERL monitors manually
	(in perlnocol). See the perlnocol/README for more information.

NOTE:	Sample config files are also provided for each monitor. These are
	copied over in the nocol/etc/samples directory during installation.
	Copy these to nocol/etc/ and edit for each monitor.

1. Plan on a location for installing the entire software. It is recommended
   that all required directories pertaining to NOCOL be under one directory
   (say /usr/local/nocol)  with perhaps symbolic links to the DATA directory.

2. Run 'Configure' in the top level NOCOL source directory:

	sh Configure  (or ./Configure)

3. Run 'make'. You might want to save the output using 'make >& make.out'
   (csh) or 'make > make.out 2>&1' (sh).

4. If 'multiping' fails to compile on your system, you will have to edit
   the pingmon/Makefile and set IPPING to your system's 'ping' location and
   also comment out the 'PROGCDEFS = -DMULTIPING' line.

   Make sure that the output of your system's 'ping' command matches the
   format shown below; otherwise you will need to make minor modifications
   in the 'pingmon/poll_sites.c' file (the area to modify is well commented,
   so it should be easy).

	solar>  /usr/etc/ping -s abc.foo.com 1000  5 | tail -2
	  5 packets transmitted, 5 packets received, 0% packet loss
	  round-trip (ms)  min/avg/max = 4/4/5

   No changes are needed if you are using the provided multiping (or rpcping)
   programs for IP (or RPC).

   If you get an error 'undefined symbol _strerror()' when linking with
   -lresolv, edit lib/Makefile.mid and add strerror.o to OBJS
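
   The ping format check described in this step can be scripted. The
   following is a minimal sketch, assuming the two summary lines look like
   the sample shown above; the function name 'ping_summary_ok' is
   hypothetical, and the patterns are derived only from that sample, not
   from the parsing code in 'pingmon/poll_sites.c'.

```shell
# Sketch: succeed (exit 0) only if stdin looks like the two-line ping
# summary shown in step 4. The patterns are an ASSUMPTION based on that
# sample, not on the actual parser in pingmon/poll_sites.c.
ping_summary_ok() {
    summary=$(cat)
    # Line 1: "N packets transmitted, N packets received, N% packet loss"
    printf '%s\n' "$summary" |
        grep -E '^ *[0-9]+ packets transmitted, [0-9]+ packets received, [0-9]+% packet loss' \
            >/dev/null || return 1
    # Line 2: "round-trip (ms)  min/avg/max = a/b/c"
    printf '%s\n' "$summary" |
        grep -E '^ *round-trip \(ms\) +min/avg/max = [0-9.]+/[0-9.]+/[0-9.]+' \
            >/dev/null || return 1
    return 0
}
```

   For example, '/usr/etc/ping -s abc.foo.com 1000 5 | tail -2 |
   ping_summary_ok' should exit with status 0 if no changes are needed.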

5. The default NOCOL logging port is defined in 'noclog.h' as 5354.
   Also, if using hostmon, the default data port is defined in the hostmon
   modules as port 5355.

   You can change these ports in these files if you want to use some other
   port number (preferably >1023 so that the programs do not have to run 
   as 'root'). Then add the following lines in your '/etc/services' file
   (mainly for inetd- the programs use the default ports if there is no
   entry in the /etc/services file).

	noclog		5354/tcp	# noclogd with TCP
	noclog		5354/udp	# noclogd with UDP
	hostmon		5355/tcp	# hostmon uses TCP
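
   To confirm that the entries were added, something like the following
   can be used. It is only a sketch: the function name 'service_ports' is
   hypothetical, and it reads a services-format file (default
   /etc/services) without consulting any other name service.

```shell
# Sketch: print the port/protocol pairs registered for a service name.
# Arguments: service name, then an optional services file
# (defaults to /etc/services).
service_ports() {
    name=$1
    file=${2:-/etc/services}
    awk -v name="$name" '
        { sub(/#.*/, "") }          # strip trailing comments
        $1 == name { print $2 }     # field 2 is port/protocol
    ' "$file"
}
```

   With the lines above in place, 'service_ports noclog' should print
   5354/tcp and 5354/udp, one per line.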

6. Make sure you can write to the destination directory, and then:

	make install
	su
	make root	# to install etherload, multiping, trapmon

7. Look at the config files in the $ROOTDIR/etc/samples directory, and 
   edit/save them to the $ROOTDIR/etc directory. List the hosts (running
   the monitors if distributed on various systems) which can log to 'noclogd'
   in the config file for noclogd.
   If saving to log files, make sure that the proper LOG directory exists for
   the log files to be created (check in noclogd-conf and log-maint).
   Edit all other config files for your site.

    PRIOR TO v4.0, IPPINGMON used the config file name 'ipnodes'. Rename
    'ipnodes' to 'ippingmon-confg'. Furthermore, the variable name for
    ippingmon was 'Reachability' in previous versions; it is now
    "ICMP-ping". Please change any locally customized scripts that
    assume the old variable name.
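
   The copy and rename in this step can be sketched as below. The function
   name 'copy_samples' is hypothetical. It assumes the layout described
   above ($ROOTDIR/etc and $ROOTDIR/etc/samples) and never overwrites a
   config file that already exists; the copied files still need to be
   edited by hand for your site.

```shell
# Sketch: populate $ROOTDIR/etc from $ROOTDIR/etc/samples without
# overwriting anything. Argument: the NOCOL root directory.
copy_samples() {
    rootdir=$1
    # Pre-v4.0 installs: carry the old ippingmon config to its new name
    # first, so the sample copy below does not take its place.
    if [ -f "$rootdir/etc/ipnodes" ] && [ ! -e "$rootdir/etc/ippingmon-confg" ]; then
        mv "$rootdir/etc/ipnodes" "$rootdir/etc/ippingmon-confg" || return 1
    fi
    for sample in "$rootdir"/etc/samples/*; do
        [ -f "$sample" ] || continue          # skip dirs / empty glob
        target="$rootdir/etc/$(basename "$sample")"
        if [ -e "$target" ]; then
            echo "kept    $target"
        else
            cp "$sample" "$target" && echo "copied  $target"
        fi
    done
}
```

   For example: 'copy_samples /usr/local/nocol'.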

8. Edit the following scripts (these are run from your CRONTAB).

   - '$ROOTDIR/bin/keepalive_monitors' checks to see if the various monitors
     are running. Edit and set the values of PROGRAMS1, HOST1, etc.
     You can also distribute the monitors on multiple systems and share 
     the /nocol disk via NFS. List all the monitors that you want to run
     per system in this file.
     It is run from the crontab every 30-60 minutes.

   - '$ROOTDIR/bin/notifier' sends email listing sites that have been
     critical for between N and N+1 hours. You can use this program to send
     email to senior personnel on your staff if sites are down for more
     than a stipulated time (as managers often tend to request).
     It is run from the crontab every hour.

   - '$ROOTDIR/bin/log-maint' cycles old logs and also runs the logstats
     program to generate statistics (it sends a HUP to the noclogd daemon).

   Create the mail aliases that you had selected for OPSMAIL and CRITMAIL.

9. Test 'noclogd' by starting it up in debug mode (-d). See if it
   complains about anything. You will have to edit noclogd-conf to
   set the location of the log files that are created. Check logging
   using the perl script 'perlnocol/testlog'. Stop  'noclogd' after testing.

   Then install the 'bin/crontab.nocol' file in the nocol user's crontab
   (usually su nocol ; crontab crontab.nocol). This will run
   'keepalive_monitors' which starts up noclogd and other monitors. If
   you want to run keepalive_monitors directly instead of cron for now,
   run it as the nocol user.

   Use 'netconsole -l 4' to see if any data is being collected under the 
   DATA directory. Look in the $ROOTDIR/etc/*.error files for any errors.

   REMEMBER that the monitors log events to noclogd only when the state
   of the event CHANGES. So nothing might be logged to noclogd if all
   the sites remain at the same state (up/down) and threshold level.
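
   Checking the *.error files can be done with a short loop. This is a
   sketch only; the function name 'nonempty_error_files' is hypothetical,
   and it assumes that an empty error file means the monitor had nothing
   to report.

```shell
# Sketch: list the *.error files under $ROOTDIR/etc that contain data.
# Argument: the NOCOL root directory.
nonempty_error_files() {
    rootdir=$1
    for f in "$rootdir"/etc/*.error; do
        # -s is true only for an existing file larger than zero bytes;
        # it also skips the unexpanded pattern when nothing matches.
        [ -s "$f" ] && echo "$f"
    done
    return 0
}
```

   For example: 'nonempty_error_files /usr/local/nocol'.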

10. You can add user 'nocol'  to your password file to allow anyone to
    log in as user 'nocol' and see the state of the network. A typical
    entry is:

	nocol::65534:65534:Network Monitoring:/tmp:/nocol/bin/netconsole

    All signals are trapped by the 'netconsole' display program and cause
    it to terminate.

11. To install the Web interface (webnocol/)
  - Check the various 'SET_THIS' lines in both genweb.pl and webnocol.cgi
    which have been copied over into your $ROOTDIR/bin/
  - Edit the &doTroubleshoot() function and check the troubleshooting
    commands in webnocol.cgi
  - Run $ROOTDIR/bin/genweb.pl from your crontab every minute:
	* * * * * /nocol/bin/genweb.pl >/dev/null 2>&1
  - Create a link called index.html to 'Critical.html' in your web tree.
  - Copy over the entire 'gif' directory structure to the same directory
    where you are generating the html pages ($webdir in genweb.pl)
  - Install webnocol.cgi in your 'cgi-bin' directory.
  - Create a null updates file and a null cookie file owned by your web daemon
	cp /dev/null $ROOTDIR/etc/updates
	cp /dev/null $ROOTDIR/etc/webcookies
	chown httpd $ROOTDIR/etc/updates $ROOTDIR/etc/webcookies
  - Create a $ROOTDIR/etc/webusers file using the sample as an example. You
    can generate an encrypted password using the utility script docrypt.pl.
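
    The null updates and cookie files from the steps above can be created
    with a small helper. This is a sketch under assumptions: the function
    name 'make_web_files' is hypothetical, the owner name (for example
    httpd) depends on your web server, and unlike a plain 'cp /dev/null'
    it leaves an existing file untouched so that re-running it does not
    erase data.

```shell
# Sketch: create empty updates/webcookies files if they are missing.
# Arguments: the NOCOL root directory, then an optional owner
# (e.g. httpd; changing ownership normally requires root).
make_web_files() {
    rootdir=$1
    owner=$2
    for f in "$rootdir/etc/updates" "$rootdir/etc/webcookies"; do
        # Create only when absent, so existing contents survive.
        [ -e "$f" ] || cp /dev/null "$f" || return 1
        if [ -n "$owner" ]; then
            chown "$owner" "$f" || return 1
        fi
    done
}
```

    For example, as root: 'make_web_files /usr/local/nocol httpd'.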

12. When you make changes to the various config files, you have to HUP the
    processes, which will restart the 'daemons'. Note that currently there
    is no way to pick up only the changes in the config file; monitoring
    will need to be restarted in order to pick up config file changes.

PERLNOCOL
---------

There is a PERL interface for developing additional NOCOL monitors. To use 
this, you need to have PERL installed on your system.

1.  If using 'hostmon', you need to run the standalone 'hostmon-client'
    programs on the machines you want monitored, and run the 'hostmon'
    process on the 'nocol' server. Check the '@permithosts' line in the
    'hostmon-client' program to ensure that it allows the nocol host to
    connect to the hostmon-client processes. Then copy over the entire
    'perlnocol/hostmon-osclients' directory to all the Unix hosts that
    you want monitored. These client routines do not use nocollib.pl
    and do not use any configuration file.
    Start up hostmon-client at boot time by making an entry in your
    /etc/rc.local or equivalent file. As an example, you can do the
    following on all your Unix hosts you want monitored:
	cd $ROOTDIR/bin
	rsh host1 mkdir /usr/local/nocol
	rcp -r hostmon-osclients host1:/usr/local/nocol
	rlogin host1
	# Now edit your /etc/rc.local or whatever system startup script
	# and add the line:
	#   (cd /usr/local/nocol/hostmon-osclients; ./hostmon-client)
	# Run this command manually for now since you are not rebooting
	# your machine.

    The 'hostmon' process on the nocol host will be restarted by the
    'keepalive_monitors' process. Edit the hostmon-confg file.

2.  To use 'snmpmon',  edit and set the thresholds in the snmpmon-confg
    file. List the devices that need to be monitored in the 
    'snmpmon-client-confg'  file and run 'snmpmon-client'. SNMP data is
    generated in the '/tmp/snmpmon_data' directory.

    You can probably have a number of snmpmon-clients running on different
    systems and rcp the datafile over to the host running the server 
    'snmpmon' program periodically from cron. If you do this, then you
    will have to compile snmpwalk and edit the locations of snmpwalk and
    mib-v2.txt in the perl script.

    (A new monitor, snmpgeneric, can also be used instead of snmpmon.)

3.  If the monitor that you want to run uses 'rcisco', then enter your 
    router's password in it and install it in nocol/bin  with mode 710.
    Alternatively, you can use the 'tcpf.c' program to run a remote
    telnet command.
    Edit the SNMP community string in any perl script if so indicated in 
    the perlnocol/README (if it uses snmpwalk).

4.  Create the config files under $ROOTDIR/etc/. Samples are in the
    $ROOTDIR/etc/samples subdirectory.

5.  For troubleshooting, set the $debug and $libdebug values to '1' or
    higher. You can also send a SIGUSR1 signal to running modules to
    change the debug level (increases to max and then resets on each
    SIGUSR1).
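
    The SIGUSR1 behaviour described above (step up to a maximum, then
    reset) can be illustrated as follows. This is a shell sketch for
    illustration only: the PERLNOCOL modules are Perl, and their actual
    handler and maximum level are not shown here. The names
    'next_debug_level' and MAX_DEBUG, and the maximum of 2, are
    assumptions.

```shell
# Sketch: cycle a debug level on each SIGUSR1.
MAX_DEBUG=2   # ASSUMPTION: the real maximum level is not documented here
debug=0

# Print the level that follows the current one: increase by one until
# the maximum is reached, then reset to 0.
# Arguments: current level, maximum level.
next_debug_level() {
    if [ "$1" -ge "$2" ]; then
        echo 0
    else
        echo $(( $1 + 1 ))
    fi
}

# Each SIGUSR1 advances the level of this shell process.
trap 'debug=$(next_debug_level "$debug" "$MAX_DEBUG")' USR1
```

    A running module is signalled with 'kill -USR1 <pid>', where <pid> is
    its process id.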

6.  Check the size of the event_t structure (see item 9 under
    TROUBLESHOOTING below).

7.  There is an X Window Tcl/Tk interface developed by Lydia Leong
    (ndaemon and tkNocol). You need 'tixwish' in order to run
    tkNocol. You should run ndaemon on the nocol host (this 
    listens on TCP port 5005). You can then run 'tkNocol' from
    any host, and it connects to ndaemon. THERE IS NO ACCESS CONTROL
    in ndaemon, so you must ensure that only permitted hosts (running
    tkNocol) can access this host through the firewall. This can also
    run on Windows machines if you have tixwish installed.

8.  There is a Windows 95/NT interface for viewing data developed by
    Jason Wright (jason@thought.net) on http://www.thought.net/jason/


TROUBLESHOOTING
---------------

1. Some warnings are to be expected, but there should be no major errors.

2. If the errors are about include files or variable types, look for the file
   that is being included under the  /usr/include sub-directories. The
   various systems love to move include files back and forth between the
   include and the include/sys directories (especially 'time.h').

3. For the nameserver monitor, old versions of the resolver library might
   complain. Some include files defined the '_res' variable differently, so
   try changing all occurrences of '_res.nsaddr' to '_res.nsaddr_list[0]'
   in the src/nsmon/nsmon.c module (look in your /usr/include/resolv.h).
   Make sure that the 'libresolv' library exists while linking.

   Newer nameserver/resolver libraries are called '-lbind' instead of
   '-lresolv', so if you have installed the latest version of bind,
   change all references in the Makefile (or Configure) to '-lbind'
   instead of '-lresolv'.
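
   The '-lresolv' to '-lbind' change can be made with sed. A minimal
   sketch follows; the function name 'use_libbind' is hypothetical, and it
   writes the result to standard output so that the original file is left
   untouched until you have reviewed the change.

```shell
# Sketch: print a copy of a Makefile with every -lresolv replaced by
# -lbind. Argument: the file to read.
use_libbind() {
    sed 's/-lresolv/-lbind/g' "$1"
}
```

   For example: 'use_libbind Makefile > Makefile.new', then compare the
   two files with diff before replacing the original.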

4. For trapmon, the CMU SNMP library is used. Make sure that it was properly
   built under 'src/cmu-snmp/snmp'. If not, try following the instructions
   in cmu-snmp/README to build and install the library in the local
   directory.

5. Most monitors have a '-d' option for debugging, or create error
   files in the $ROOTDIR/etc.

6. If you get a 'h_addr_list[0]' not defined error, simply edit 
   nocol.h and add the following line in it:

	#define h_addr 1

   This is because of the difference in the hostent structure of netdb.h
   in very old systems.

7. Check if the regular expressions in the &dotest() routine in the PERL
   modules need any changes for your site.

8. If you get strerror() undefined errors, try adding strerror.o to
   NEEDOBJS in the lib/Makefile.mid. If you get pfopen() errors in
   etherload on DEC OSF1, then add pfopen.o to the etherload/Makefile.mid
   OBJS definition.

9. In PERLNOCOL, watch out for the padding in the '$event_t'  template.
   C compilers tend to align the fields of structures on even byte boundaries,
   so you might have to add some additional 'null' padding using 'x' depending
   on your system architecture. Set $libdebug = 1 to see the size of the 
   $event structure. The size of the data files produced by the C monitors
   should be a multiple of the perl $event structure. The C utility program
   'show_nocol_struct_sizes' can be used to see the event struct sizes
   in the C modules.
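
   The size check described in this item can be sketched as below. The
   function name 'is_multiple_of_event_size' is hypothetical; the event
   size must be supplied by you (as reported with $libdebug = 1), since
   it depends on your system architecture.

```shell
# Sketch: succeed (exit 0) if a data file's size in bytes is an exact
# multiple of the event structure size.
# Arguments: data file, event structure size in bytes.
is_multiple_of_event_size() {
    file=$1
    event_size=$2
    [ "$event_size" -gt 0 ] || return 2      # guard against division by 0
    # The arithmetic expansion strips any padding wc adds to its output.
    bytes=$(( $(wc -c < "$file") ))
    [ $(( bytes % event_size )) -eq 0 ]
}
```

   A non-zero exit status for a data file written by a C monitor suggests
   that the padding in the perl template needs adjusting.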

10. Check the syntax of the ping() function in the nocollib program 
    to make sure that the command line arguments are okay for your system.

Best of luck. Comments to 'nocol-info@jvnc.net' and bugs to 'vikas@navya.com'

The README file has more information. For an overview, look at the file
doc/nocol-overview.8.


	Vikas Aggarwal
	(vikas@navya.com)
	January 1997
	-----------------

