Dead NDS Replicas

Useful notes taken from a Novell Technical Document on how to remove all replicas from a server when the normal tools won't work.

How to Manually Remove All Replicas From a Server

DSREPAIR -XK2

NOVELL DOCUMENT ID: 10026822 (Solution: 1.0.55947281.2542456)

GOAL:

How to Manually Remove All Replicas From a Server; DSREPAIR -XK2

Manually Removing Replicas - DSREPAIR -XK2

FACT:

Novell NetWare 5.0, Novell NetWare 4, Novell Directory Services

SYMPTOM:

  • Replica is stuck in a dying or DEAD state
  • Newly added replica stuck in new or dying state and not advancing.
  • Server cannot communicate properly with the rest of the tree.
  • Server holding up synchronization because of object corruption.
  • Server getting -672 error because its replica ring or rings are inconsistent with the other servers.
  • Error: -761
  • After creating or renaming an object, the object name is changed to "1_2" (corrupt replica)
  • The only replica on the server is stuck in a dying state
  • Server is in high utilization due to corrupt NDS replica

CAUSE:

Unknown and/or corrupt replica / NDS data corruption. Usually due to a power outage, critical abend, communications problems, or hardware failure.

NDS replicas can become corrupt for many reasons. Occasionally, partition operations become stuck and need to be manually fixed.

FIX:

WARNING: Only follow this procedure under the advisement of Novell Technical Support. If performed incorrectly, this procedure can cause major damage to a directory tree. Novell does not support this procedure unless under the specific recommendations of a Technical Support Engineer.

There are additional considerations when using this procedure on a server that is running NDS8. Contact Novell Technical Support for more information.  The following steps will force all replicas off of a server and clean up the replica rings. This process has many implications:

A) If the only real replica of a partition exists on this server, all data for objects in this partition will be permanently lost.

B) Once started, this process cannot be halted.

C) If not followed correctly, this procedure can cause more damage to the NDS tree.

D) Do not perform this procedure on a server running NDS 8.x without first contacting Novell Technical Support.

F) On the menu that comes up select REMOVE THIS SERVER FROM THE REPLICA RING.

G) Enter the ADMIN username and password.

H) Type the words I AGREE on the next screen.

I) Repeat the above steps for each replica that is on the defective server.

Often, a single server may hold copies of multiple partitions. This same server can then be used when removing the defective server from the replica rings. The advantages to this situation are that you will only need to authenticate to the server once, and a heartbeat will only need to be forced once all of the replica rings have been cleaned up (see below).

J) For each server that you used to remove the defective server from the replica ring, force a synchronization heartbeat. This helps to guarantee that all servers in the replica ring are notified of the change and update their individual databases. Normally, this information is forwarded automatically to all the necessary servers. To force it manually, enter the following commands:

SET DSTRACE=+S
SET DSTRACE=*H

5. From the server console of the defective server, load DSREPAIR -XK2 | Advanced Options | Repair Local DS Database. The only two options that need to be set to YES are "Check Local References" and "Rebuild Operational Schema". Set all other options to NO. Then press F10 to start the repair. The -XK2 switch removes all replicas from the server when a local database repair is performed.

This operation does not remove Directory Services from the server. No server information or references should be lost in this process. However, if bindery services were enabled on the server, they will not work properly until the proper replicas are added back.  

This procedure also resets all of the objects that were local on the server to externally referenced objects in a "reference" state. This allows the server to redirect calls to objects to other servers that hold actual copies of those objects. On certain older versions of DSREPAIR, -XK3 is needed to manually set externally referenced objects to a reference state. It does not hurt to add -XK3 after -XK2 just in case you are using an older DSREPAIR.NLM.

6. The repair may take anywhere from 30 seconds to over an hour, depending on how many replicas were stored on the server and the speed of the hardware.  The repair will take about as long as a typical database repair would have previously taken on this server. When the repair is completed, save the database and exit DSREPAIR. It does not matter how many errors were found in the repair.

7. The defective server now needs to have accurate references to the real objects that were deleted from the server's database.  These references are verified by the backlink process, which runs by default every 13 hours on a server. Until the process is completed, users may have trouble authenticating to the server. Instead of waiting up to 13 hours, you can force the process by entering the following commands:

SET DSTRACE=NODEBUG
SET DSTRACE=+BLINK (Turns DSTRACE screen ON and allows you to see the backlink process)
SET DSTRACE=*B (Manually starts the backlink process)

8. Toggle to the directory services screen and watch for the message "Finished checking backlinks successfully" or "Finished checking backlinks succeeded." This message indicates that the defective server now holds the proper information. Users should be able to authenticate to the server without any problems.

9. At this point, the replicas are off the problem server and the tree should be functioning normally. Replicas may now be added back to the server, as necessary. Occasionally, multiple servers need this procedure performed in order to clean up a tree. Contact Novell Technical Support for help in determining which servers may need this procedure performed. It is also advisable to run through the NDS Health Check Procedures, Solution 4.0.4561524.2239956, to verify that the tree is synchronizing properly.
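
The check in step 8 can be automated if the trace screen output has been captured to a text file. The following is only a minimal sketch: the two message strings come from step 8, but the function name, the idea of a captured log, and the sample lines are assumptions made for illustration, and real trace output will look different.

```python
# The two completion messages quoted in step 8.
BACKLINK_DONE_MESSAGES = (
    "Finished checking backlinks successfully",
    "Finished checking backlinks succeeded",
)


def backlinks_finished(trace_lines):
    """Return True if any captured trace line reports backlink completion."""
    for line in trace_lines:
        lowered = line.lower()
        # Case-insensitive substring match against either message.
        if any(msg.lower() in lowered for msg in BACKLINK_DONE_MESSAGES):
            return True
    return False


# Hypothetical captured lines, for illustration only.
sample = [
    "some unrelated trace line",
    "Finished checking backlinks successfully",
]
print(backlinks_finished(sample))
```

A substring match is used because the exact layout of each trace line (prefixes, timestamps) is not specified in this document.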

Note:

In Novell Directory Services 8 (eDirectory), the -RC and -XK2 switches need to be run separately, in the given order. Before performing this procedure, verify that good real copies of every replica held on the problem server exist on other servers, because the procedure removes all replicas from the server without prompting the user.
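
The verification described in this note (and in implication A above) amounts to checking, partition by partition, whether any other server holds a real copy. The sketch below illustrates that reasoning only. The data layout, the function name, and the server and partition names are hypothetical. Treating master, read/write, and read-only replicas as real copies, and subordinate references as not, is an assumption based on common NDS terminology. The replica lists themselves would have to be gathered by hand from the replica ring views.

```python
# Assumption: replica types that hold a full copy of the partition's objects.
REAL_REPLICA_TYPES = {"master", "read/write", "read-only"}


def partitions_at_risk(rings, problem_server):
    """List partitions whose only real replicas live on problem_server.

    rings maps a partition name to a list of (server, replica_type) pairs.
    """
    at_risk = []
    for partition, replicas in rings.items():
        real_holders = {
            server
            for server, replica_type in replicas
            if replica_type.lower() in REAL_REPLICA_TYPES
        }
        # Data would be lost if the problem server holds a real copy
        # and no other server does.
        if problem_server in real_holders and len(real_holders) == 1:
            at_risk.append(partition)
    return sorted(at_risk)


# Hypothetical tree layout, for illustration only.
rings = {
    "O=ACME": [("FS1", "master"), ("FS2", "read/write")],
    "OU=SALES.O=ACME": [("FS2", "master"), ("FS3", "subordinate reference")],
}
print(partitions_at_risk(rings, "FS2"))
```

Any partition this reports should be treated as a reason to stop and contact Novell Technical Support before running DSREPAIR -XK2 on that server.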
