@@ Line 1: / Line 1: @@
+<code>
+[09] WHAT SHOULD I DO IF A SYSTEM CRASHES OR LOCKS UP?
+     Hopefully this will not happen at all to you, but if you experience
+     'lock ups' or 'freezes', please follow these steps to help prevent
+     your own data loss.
+     Also, keep in mind that you do not have a direct connection to
+     SDF and are most likely hopping through 10 or more networks to
+     get there.  You can use ping and traceroute to measure lag between
+     your computer and SDF.  The lag you experience on SDF is therefore
+     subjective: it depends on your network path as well as on SDF itself.
+     Typically a lockup will occur when you are trying to access a
+     file that is resident on the fileserver.  For instance, say you
+     are trying to cat a file and instead of seeing the contents you
+     get either nothing or a message similar to:
+     ol1:/sys: not responding
+     Be patient; the fileserver will recover shortly and your task
+     will be completed.  You will probably then see:
+     ol1:/sys: is alive again
+     which means your request will actually begin to be processed.
+     During the hang time, you can use ^T (CTRL-T) to display the
+     status of your job, for instance:
+     load: 2.04  cmd: tail 12966 [select] 0.00u 0.00s 0% 808k
+     [select] is the current state of process ID 12966, which
+     is the 'tail' program.  If the system is waiting on actual
+     disk I/O, you'll probably see [biowait].  In cases of a hang
+     you may see either [nfsrcvlk] (Network File System Receive Lock)
+     or [vnlock] (Virtual Node Lock).  The system will usually
+     recover from these, but a prolonged state can indicate a
+     serious resource problem on the NFS client.
+     In the event that the fileserver becomes unavailable, it is
+     important that you do not become impatient and interrupt, quit
+     or suspend your jobs (^C, ^\ or ^Z) but rather, wait them out.
+     If you are patient, your chances of losing data will be
+     significantly reduced.  The fileserver will usually respond
+     within a few seconds and rarely takes longer.  When the problem
+     lies with the NFS client (vnlock for more than, say, 20
+     seconds), that particular host will most likely need to be reset.
+     More on this:  SDF is pushing NetBSD to its limits and we are
+     currently (2003-2004) doing quite a bit of investigation with
+     the uvm/vfs/vnode code developers to help NetBSD become scalable
+     in high usage situations such as the loads we experience on SDF.
+     Solutions we find will be incorporated into the public code.
+</code>
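The ^T status line described above can also be read by a small script, for example to flag the wait states the FAQ associates with a hang. The following is only a minimal sketch: the field layout is assumed from the single sample line in the FAQ, and the names `parse_status`, `looks_hung`, and `HANG_STATES` are hypothetical helpers, not part of any SDF or NetBSD tool.

```python
import re

# Pattern for a ^T status line of the form shown in the FAQ, e.g.
#   load: 2.04  cmd: tail 12966 [select] 0.00u 0.00s 0% 808k
# Assumption: other status lines follow the same layout as this sample.
STATUS_RE = re.compile(
    r"load:\s*(?P<load>[\d.]+)\s+"
    r"cmd:\s*(?P<cmd>\S+)\s+(?P<pid>\d+)\s+"
    r"\[(?P<state>[^\]]+)\]"
)

# Wait states the FAQ mentions in connection with a hang (illustrative set).
HANG_STATES = {"nfsrcvlk", "vnlock"}


def parse_status(line):
    """Return load, command, pid and wait state, or None if the line does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return {
        "load": float(match.group("load")),
        "cmd": match.group("cmd"),
        "pid": int(match.group("pid")),
        "state": match.group("state"),
    }


def looks_hung(line):
    """True if the status line shows one of the hang-related wait states."""
    info = parse_status(line)
    return info is not None and info["state"] in HANG_STATES
```

A state such as `select` or `biowait` is not flagged; only the two lock states named in the FAQ are. How long a state must persist before it counts as a problem is left to the reader, since the FAQ gives only a rough figure of about 20 seconds for vnlock.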
+[[misc|back]]