Migrating off of NetWare

It has been around a year since we did the heavy lifting of migrating off of NetWare and retiring our eDirectory tree. By this point last year we had our procedures in place, we just needed to pull the trigger and start moving data around. I was asked to provide some hints about it, but the mail bounced with a 550-mailbox-not-found error *ahem*.

Because it's such a narrowly focused topic, and the WWU people who read me lived through it and therefore already know this stuff, I'm putting the meat of the post under the fold.

You're welcome.

It turned out that our migration was pretty painless. Server 2008 is close enough to feature-parity with NetWare that only the lack of Salvage has really been noticed. Also, our NetWare/eDir environment was built in such a way that it made migration a lot easier than it otherwise could have been.

Prep-work
Before moving any data, there was a good chunk of prep work we had to get done.

  • Start syncing usernames and passwords between the two environments
  • Turn CIFS sharing on for our NW volumes
  • Work with Mac community to examine options
  • Create scripts to migrate/sync eDir group memberships to AD equivalents
  • Create scripts to copy and apply NTFS ACLs to volumes copied from NW
  • Create scripts to copy NetWare directory-quotas to Windows directory-quotas
  • Create an AD login script
  • Set up ShadowCopy schedules
  • Issue edicts about the availability of AD resources to non-AD machines (specifically, limited)
Before anything could be done we needed to get usernames and passwords synced between the eDir and AD environments. Happily for us, we put exactly this kind of system in place when we deployed Active Directory back in 2001 or so. We wrote our own system to do that since at the time there wasn't a whole lot out there that did it (with NDS for NT not working for AD, and DirXML being still obscure, we ended up going the custom route).

Turning CIFS on was the first step of the actual project. It enabled our Vista and Win7 clients to access our NetWare data without having to endure the Novell Client for Vista/Win7. Unfortunately, it wasn't that stable. Our early Win7 adopters got bitten by this. Things just worked or didn't work. It didn't help that Win7 ships with the LanManAuthenticationLevel set to 'disable LM', as NetWare needs LM. This caused intruder lockouts (ILOs) where the IP address of the lockout was the cluster node hosting the volume they were attaching to.

We then worked with our Mac community to identify what would work for them. NetWare had native AFP support, and that... didn't work so well on Windows. After doing some tests and working with people, it was decided that the OSX Samba implementation was robust enough that we could use normal SMB sharing for that community instead of AFP. This reduced our project complexity.

We were blessed in that a while back we mandated that all groups used for ACLs on our cluster had to be in a single OU, which coincidentally enforced name uniqueness. Those groups that were outside of that specific OU had to be manually copied and maintained by departmental sysadmins, but 95% of our groups were in this one OU.

The script I wrote was a two-parter. It used Perl (the generic LDAP support is much better than PowerShell's) to connect to eDirectory and pull down groups and memberships. It then split that data into two files: a list of groups, and a second one with group,membername information. These text files were then used by a PowerShell script to merge changes into Active Directory. The created groups were placed into a specific AD OU, and were all prefixed with "NW-", which made them unique per our naming convention.
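To make the shape of those two text files concrete, here is a minimal sketch of the split step. It is written in Python purely for illustration (the original was Perl plus PowerShell), the function name and sample data are hypothetical, and the LDAP query against eDirectory is assumed to have already produced a simple group-to-members mapping.

```python
import csv
import io

PREFIX = "NW-"  # prefix named in the post; everything else here is illustrative


def split_groups(groups):
    """Turn {group_name: [member, ...]} into the two flat files.

    Returns (group_list_text, membership_text), where the second file
    holds one "group,membername" pair per line.
    """
    group_buf = io.StringIO()
    member_buf = io.StringIO()
    member_writer = csv.writer(member_buf, lineterminator="\n")
    for name in sorted(groups):
        ad_name = PREFIX + name
        group_buf.write(ad_name + "\n")
        for member in sorted(groups[name]):
            member_writer.writerow([ad_name, member])
    return group_buf.getvalue(), member_buf.getvalue()


if __name__ == "__main__":
    # Hypothetical sample standing in for the eDirectory query results.
    sample = {"FinanceStaff": ["bsmith", "ajones"], "Registrar": ["cdoe"]}
    group_list, memberships = split_groups(sample)
    print(group_list, end="")
    print(memberships, end="")
```

Keeping the export as two flat files means the AD side can be re-run as often as needed without touching eDirectory again.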

We also had to write a script to copy permissions. We used TRUSTEE.NLM to export the permissions to a file, and that file was then parsed to set the NTFS equivalents. This required some hard decisions, as NW trustees allow things that NTFS doesn't, such as allowing people to browse from the root of the volume to a directory 4 levels deep where they were assigned rights, and creating file-system drop-boxes. The script uses the edirName to adName conversion the group-copy script set up.
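A rough sketch of the translation step, again in Python for illustration only. The rights-letter mapping below is one plausible simplification and not the one we necessarily used; the EXAMPLE domain, the function names, and the sample path are all hypothetical; and the hard cases mentioned above (drop-boxes, browse-through rights) are deliberately ignored.

```python
def ntfs_right(nw_rights):
    """Collapse NetWare trustee letters (S R W C E M F A) into one
    icacls simple right. This mapping is an assumption for illustration."""
    rights = set(nw_rights.upper())
    if rights & {"S", "A"}:            # Supervisor / Access Control
        return "F"                     # full control
    if rights & {"W", "C", "E", "M"}:  # Write / Create / Erase / Modify
        return "M"                     # modify
    if rights & {"R", "F"}:            # Read / File Scan
        return "RX"                    # read and execute
    return None


def icacls_command(path, edir_name, nw_rights, name_map, domain="EXAMPLE"):
    """Build an icacls grant line, or None if there is nothing to grant
    or the eDir name has no AD equivalent in name_map."""
    right = ntfs_right(nw_rights)
    ad_name = name_map.get(edir_name)
    if right is None or ad_name is None:
        return None
    # (OI)(CI) makes the grant inherit to files and subfolders.
    return 'icacls "{}" /grant "{}\\{}:(OI)(CI){}"'.format(
        path, domain, ad_name, right
    )


if __name__ == "__main__":
    names = {"FinanceStaff": "NW-FinanceStaff"}  # from the group-copy step
    print(icacls_command(r"M:\shared\Finance", "FinanceStaff", "RWCEMF", names))
```

Returning None for unmapped names makes it easy to log the trustees that need a human decision instead of silently dropping them.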

The script to copy the directory quotas was pretty straightforward. Another TRUSTEE.NLM file was created to export dir-quota information, which was then applied to the file-system. There was one gotcha that we didn't catch until a few months later. NetWare will permit this:

\Finance = 4GB
\Finance\Admissions = 12GB

On Windows, by contrast, the 'Admissions' folder would effectively have a 4GB quota, since the parent's limit applies to everything beneath it. We only had two instances of that, though.
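A pre-flight check for this gotcha is cheap to write. The sketch below is illustrative Python, not part of our actual quota script; the function name and the dictionary input format are assumptions. It flags any directory whose quota is larger than that of one of its ancestors.

```python
from pathlib import PureWindowsPath


def find_quota_conflicts(quotas):
    """quotas maps a directory path to its limit in GB.

    Returns sorted tuples of (child, child_gb, ancestor, ancestor_gb)
    for every child whose quota exceeds an ancestor's quota.
    """
    parsed = {PureWindowsPath(p): gb for p, gb in quotas.items()}
    conflicts = []
    for path, limit in parsed.items():
        for ancestor in path.parents:
            if ancestor in parsed and parsed[ancestor] < limit:
                conflicts.append(
                    (str(path), limit, str(ancestor), parsed[ancestor])
                )
    return sorted(conflicts)


if __name__ == "__main__":
    exported = {r"\Finance": 4, r"\Finance\Admissions": 12}
    for child, child_gb, parent, parent_gb in find_quota_conflicts(exported):
        print(f"{child} ({child_gb}GB) is capped by {parent} ({parent_gb}GB)")
```

Running something like this against the exported quota list before the copy would have surfaced our two cases up front.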

The AD login script has been by far the hardest process, and is still evolving. Unlike eDir, AD has no native login-script syntax. It can fire a VBScript via the 'OnLogin' event by way of a Group Policy, which is nice as far as it goes, but there is no 1:1 equivalent to "IF Member Of "OU=Students" THEN map j:=wuf-class1/class1:". We had to create function-calls in the VBScript to replicate this functionality. At the moment, the login-script header, where all the functions are defined, is about 16KB in size. Half of that space is currently (as of this writing) devoted to setting up and managing the debug logging environment and dealing with the concurrency problem. AD is perfectly happy to run a login script twice, simultaneously.
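The two problems in that paragraph, membership-driven mappings and double execution, can be sketched in a few lines. This is Python standing in for the VBScript, so it shows only the logic; the rule table, the second share name, and the lock-file approach are assumptions for illustration and not what our script actually does.

```python
import os

# (group, drive letter, UNC path). The first share name appears in this
# post; the second rule is hypothetical.
RULES = [
    ("Students", "J:", r"\\msfs-class1\class1"),
    ("FinanceStaff", "S:", r"\\msfs-shared\finance"),
]


def drive_mappings(user_groups, rules=RULES):
    """Return [(drive, unc)] for every rule whose group the user is in."""
    member_of = {g.lower() for g in user_groups}
    return [(drive, unc) for group, drive, unc in rules
            if group.lower() in member_of]


def try_acquire_lock(lock_path):
    """Atomically create a lock file. Returns False if another copy of
    the script already holds it, so the second run can exit quietly."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release_lock(lock_path):
    os.remove(lock_path)
```

A table of rules keeps the per-department logic in data, so the function header stays fixed while the mappings change.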

One deficit with VBS is that when it comes to group names it can only see the SamAccountName, which is limited to 20 characters.

Once we're 100% upgraded to at least Windows 7 we can use PowerShell. Until then, stuck with VBS.

Since Windows doesn't have Salvage, we leveraged Shadow Copies (also known as 'Previous Versions') for this. Unlike Salvage, which'll keep any deleted file around so long as there is space, Shadow Copy keeps files extant at a certain discrete point in time. If a file is created and deleted between snapshots, we can't bring it back like we could on NetWare. We take snapshots three times a day during the week, and once during the weekend.
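The recoverability rule is simple enough to state as code. A minimal sketch in Python, with a hypothetical function name and made-up snapshot times (the post only says three a day):

```python
from datetime import datetime


def recoverable(created, deleted, snapshots):
    """A deleted file can be restored only if at least one snapshot was
    taken while the file existed."""
    return any(created <= snap < deleted for snap in snapshots)


if __name__ == "__main__":
    # Hypothetical weekday schedule: 07:00, 12:00 and 17:00.
    day = [datetime(2010, 1, 4, h) for h in (7, 12, 17)]
    # Created at 08:00 and deleted at 11:00: no snapshot in between.
    print(recoverable(datetime(2010, 1, 4, 8), datetime(2010, 1, 4, 11), day))
    # Created at 08:00 and deleted at 13:00: the noon snapshot has it.
    print(recoverable(datetime(2010, 1, 4, 8), datetime(2010, 1, 4, 13), day))
```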

And finally, the politics. Upper management got buy-in from the right people, so we were able to state that we would not actively support connectivity to non-AD departments. Hooray.

Once all that was in place, it was time to move volumes.

During migration
The actual migration process was pretty straightforward:
  1. A week before the migration, on the Windows server, map a drive to the NW server via CIFS.
  2. Run a robocopy mirror of the data (robocopy q:\ m:\shared\ /mir /r:1).
  3. Once the initial copy finished, once a day, do a delta mirror (same command).
  4. On the day before migration, run the permission copy script.
  5. Tell people who edit permissions to keep their hands clear of the volume until we tell them it's OK.
  6. A few hours before the migration, do another delta-copy.
  7. At the migration time, change the login scripts in AD and eDir.
    1. AD: Change the mapping from pointing to the CIFS volume to the new Windows volume.
    2. eDir: Change the mapping to a 'net use' command to the new Windows volume.
  8. At this point, from the user's point of view the volume is migrated.
  9. Run the quota-copy script.
  10. Disable logins on the NetWare volume and terminate all connections but the known good ones, like mine.
  11. Run a final robocopy to pick up open files and new files.
  12. Dismount the NetWare volume.
  13. Tell the techs that the data is moved and they can touch permissions again.
Total downtime as far as the users are concerned? About 15 minutes. People left logged in overnight get a bum drive letter, but a simple logout/login cycle will fix that. Our helpdesks were notified of this well in advance.
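The repeated delta mirror (steps 2, 3, 6 and 11) is the kind of thing worth wrapping so a failed run doesn't go unnoticed. Below is a hedged Python sketch, not what we actually ran; the function names are hypothetical. It uses the same robocopy switches as step 2 and relies on robocopy's documented exit-code convention, where values of 8 or higher mean at least one copy failed.

```python
import subprocess


def robocopy_args(source, dest):
    """Same switches as the command in step 2: mirror, one retry."""
    return ["robocopy", source, dest, "/mir", "/r:1"]


def robocopy_succeeded(exit_code):
    """Robocopy exit codes are bit flags; 0-7 cover 'nothing to do',
    'files copied', 'extra files' and 'mismatches', while 8 and above
    mean copy failures or a fatal error."""
    return 0 <= exit_code < 8


def run_delta_mirror(source, dest):
    """Run one mirror pass and report whether it completed cleanly.
    Only works on a Windows host where robocopy is on the PATH."""
    result = subprocess.run(robocopy_args(source, dest))
    return robocopy_succeeded(result.returncode)


if __name__ == "__main__":
    if not run_delta_mirror("q:\\", "m:\\shared\\"):
        print("Delta mirror reported failures; check the robocopy output.")
```

Note that a plain "non-zero means failure" check would misfire here, since a successful copy normally returns 1.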

While we were copying data we were in a dual-stack environment. Some drive-letters were on the NetWare cluster, others were on the Windows cluster. Because of the low impact of the data migration we were able to do it with evening migration windows during regular session.

At this point we started getting requests for timing on when people could remove the Novell Client from stations. Because of the instability of CIFS on NetWare we didn't formally recommend removal until after the last volumes were moved, but we did start working on a removal script.

It was during this time that we noticed one issue with the OSX SMB implementation. Specifically, certain versions of it couldn't browse the shares on the Failover Cluster. They'd go to something like \\msfs-class1\ and wouldn't get a list of shares where they would on Windows, but could still map to \\msfs-class1\class1\ directly. ATUS built a web-page for guiding Mac users to the correct connection strings to use. 10.5 had major problems, thus the page's mention of 10.4, but 10.6 resolved all of them.

Post migration
After we were done, there were a few other clean-up areas.
  • Printing. Most of our non-student printing was done by direct IP printing to printers. Apparently our desktop techs didn't like NDPS and therefore didn't use it much. Such printers didn't need help. We did have some NDPS printers, so I worked with individual area-techs to cut over people to either direct-IP (almost all of them) or base them off of the student print-server (a few of them).
  • LDAP Authentication. We had a large number of web-applications using eDir for their LDAP auth source. We kept our eDir replica servers up for quite some time to help people move to AD-LDAP auth instead of eDir-LDAP auth.
  • Departmental servers. We in ITS had our stuff migrated by April, but the departmental NetWare servers moved a lot later. June and July. This meant we had to keep our replica servers up. But eventually we did turn them off.
  • Domaining Workstations. This was ongoing, but it needed doing.
  • Upgrading Macs. 10.5 really didn't like working with the Failover Cluster, so we had to upgrade the 10.5 machines we had to 10.6. 10.4 worked just fine, funnily enough, so our older PPC clients have a working version as well.
  • Novell Client Removal. We created a script to remove it, but it wasn't easy, nor am I going to share it. The Novell client has several parts and there is no reliable 'silent' way to get rid of it that we found. The script we made worked in 80-90% of the cases, and the rest required some kind of manual intervention.
Since then our biggest trouble spot has been the AD login script, concurrency issues mostly, but otherwise we haven't had much in the way of user or technical protest after the move. I didn't touch on printing issues here since 98% of our printing is from the student labs and that's a specialized environment. That migrated in the August/September 2009 timeframe and worked very well for a fundamentally new environment.

Update 1/5/2011: Forgot about perhaps the biggest problem, user and password sync. Also added Mac information.

5 Comments

Many thanks, great info! Now I'm off to threaten our mail filter provider again....

How did you handle AFP access for Macs, considering NetWare has this functionality?

Yeah, aren't those archaic libraries (like Samba) on old OS X releases so much fun? We would have been forced to upgrade to 10.6 for just that particular Samba issue. Guess what that would have also meant? All new Macs, because we still had PowerPC Macs. That was all a bit of money ($$$), so we finally convinced them to switch to Windows. The hardware was cheaper, and we already had MS CALs for them.

Hey, thank you for this very helpful info!

We will migrate also very soon our 1.5 TB data from the NW6.5 to a new virtual 2008 R2 server and I guess we have to do exactly the same steps. But we have already built up an AD since years with the equivalent user credentials, the big problem will be just to link the eDir and AD objects to move the file permissions on the Win server.

My big question is also how you managed all the files with UNC hyperlinks inside? We have tons of Word and Excel files with a UNC path link to another file. Would it work if we name the new Windows server like the old NetWare server and build up the directory structure exactly the same and start with a root directory called VOL1 (the volume name from NetWare)? Would all the UNC paths then still work?
The other solution would be to find a tool with which we can rename all the UNC links in all these files and redirect them to the new server name.

How did you handle that?