I started to learn a new programming language, Swift. It’s great!
-
New programming language
-
Problem with booting SmartOS
Recently we had a problem booting SmartOS.
Valentin Zaretsky described the following issue:
« SmartOS hang strangely: smartos itself, native VM’s and KVM’s continued responding to ping on their IP’s but nothing else worked.
After hardware restart I cannot login to system: after getting root password it waits for something and does not show shell prompt. VM’s are not running. But network interface comes up, ssh prints banner «SSH-2.0-Sun_SSH_1.5» and the same way as on console hangs after getting password from user. On client ssh -v stops on the following:
    debug1: kex: server->client aes128-ctr hmac-md5 none
    debug1: kex: client->server aes128-ctr hmac-md5 none
    debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<3072<8192) sent
    debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
When I boot with noimport=true, I’m able to login with default password and able to do zpool import zones. And pool seems to be in normal healthy status. System is rather old (20131128T230213Z) but had no problems all the time running so I did not upgrade it. »
Keith Wesolowski gave us the following advice:
« Most, but not all, instances like this where the system seems ok until you try to actually log in or do something with it are actually caused by problems in the disk subsystem. These problems may be transient or persistent, and they may be caused by software bugs or by hardware or firmware issues; the latter are more common.
When you boot with noimport and then import, can you subsequently enable all services and then ssh in? What does fmadm faulty show you? If nothing, are there errors occurring that are precursors to fault diagnosis? You can find that out via fmdump -e. Anything in the logs (you’ll need to import the pool first to read them, which is also the case with the FMA data).
Failing all of that, I would recommend booting with -m milestone=none. You should be able to log in using the *platform* default root password (which is not the same as the one you set at setup time). From there, you should be able to set up DTrace probes to monitor the progress of startup, then do ‘svcadm milestone all’ to start all the services. DO NOT LOG OUT OF THE CONSOLE! You will need it to monitor and debug the problem.
If all services (except of course console-login) seem to come up normally, you can then use your favourite tools — DTrace, truss, mdb, etc. — to debug the sshd server when you try to log in. You’ll likely need to iterate a few times to narrow your search for the problem as your understanding improves. This is a naive brute-force approach to debugging that almost always yields progress of some kind, even if it’s negative progress. If you can’t learn anything at all this way, a last-ditch option (which likely won’t work if the problem is with the disks or HBA) is to generate an NMI, which will cause the system to panic and create a crash dump. If you then boot and import the pool, you should be able to run savecore to grab the dump, which can then be analysed to better understand why things were hanging. How to generate an NMI is hardware-specific, and most desktop or consumer-type systems don’t support it. Among those that do, the most common way is to issue the IPMI ‘chassis power diag’ command remotely using ipmitool. We ship this tool, and it’s widely available on all POSIX-type OSs. If your system doesn’t have a BMC, or that doesn’t work, consult your vendor-supplied docs. »
We have yet to check everything he advised, but we already know far more about the SmartOS boot process than we did before.
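The commands named in the advice above can be collected into a small checklist script. This is only a sketch: the function names and the `run` dry-run wrapper are hypothetical helpers, `zpool status` is an added assumption (the advice itself only mentions importing the pool), and the BMC host and user passed to `ipmitool` are placeholders. By default nothing is executed; each command is just printed.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints each command by default.
# Set DRY_RUN=0 to actually execute the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Checks after booting with noimport=true.
triage_after_noimport_boot() {
    run zpool import zones   # import the pool so logs and FMA data are readable
    run zpool status zones   # assumption: not in the advice, confirms pool health
    run fmadm faulty         # any diagnosed faults?
    run fmdump -e            # error events that precede a fault diagnosis
}

# After booting with -m milestone=none, run from the console and stay logged in.
start_services_from_console() {
    run svcadm milestone all
}

# Last resort: send an NMI through the BMC from another machine.
# $1 = BMC host, $2 = BMC user (both placeholders).
force_crash_dump() {
    run ipmitool -I lanplus -H "$1" -U "$2" chassis power diag
}

# After the panic, reboot, import the pool, then collect the dump.
collect_crash_dump() {
    run zpool import zones
    run savecore
}
```

Calling `triage_after_noimport_boot` prints the four commands in order, so the list can be reviewed before running it for real with `DRY_RUN=0`.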
-
Trust me, I’m an engineer
-
SmartOS zone that will serve up SmartOS
PXE Booting SmartOS from SmartOS zone
Motivation
We bought a new Supermicro server: a chassis and four blades. The provider installed Ubuntu on one of them, and from it I have already set up SmartOS on the other three blades. As you know, a host machine running SmartOS boots from a PXE server. But I don’t want to keep a separate blade running Linux, so to be safe I decided that each blade should be able to act as a boot server for the rest of them. I could have deployed Linux in a KVM on each host, but I found a better solution: deploy the PXE server in a native SmartOS zone. Isn’t it wonderful when SmartOS can boot SmartOS?
Here’s how to set up a simple PXE server in a SmartOS zone that will serve up SmartOS:
    imgadm update
    imgadm import 8639203c-d515-11e3-9571-5bf3a74f354f
Create pxe-server.json with the following:
Zone Configuration
    {
      "alias": "pxe-server",
      "hostname": "pxe-server",
      "brand": "joyent",
      "max_physical_memory": 64,
      "quota": 2,
      "image_uuid": "8639203c-d515-11e3-9571-5bf3a74f354f",
      "resolvers": [
        "8.8.8.8",
        "8.8.4.4"
      ],
      "nics": [
        {
          "nic_tag": "admin",
          "ip": "192.168.0.2",
          "netmask": "255.255.255.0",
          "gateway": "192.168.0.1",
          "dhcp_server": "1"
        }
      ]
    }
Then create the zone:
    vmadm create -f pxe-server.json
Setting up TFTP
Use zlogin to log into the zone:
    zlogin <uuid>
In the zone:
    pkgin -y install tftp-hpa
    mkdir /tftpboot
    echo "tftp dgram udp wait root /opt/local/sbin/in.tftpd in.tftpd -s /tftpboot" > /tmp/tftp.inetd
    svcadm enable inetd
    inetconv -i /tmp/tftp.inetd -o /tmp
    svccfg import /tmp/tftp-udp.xml
    svcadm restart tftp/udp
Setting up DHCP (using Dnsmasq)
    pkgin -y install dnsmasq
Edit /opt/local/etc/dnsmasq.conf:
    dhcp-range=192.168.0.200,192.168.0.220,2h
    dhcp-match=set:gpxe,175
    dhcp-boot=tag:!gpxe,undionly.kpxe
    dhcp-boot=smartos.ipxe
    dhcp-leasefile=/etc/dnsmasq.leases
Then enable the service:
    svcadm enable dnsmasq
Setting up the tftpboot directory
Ben Rockwood provides a version of undionly.kpxe on his site. Run the following to get the PXE chainload binaries in place:
    cd /tftpboot
    curl http://cuddletech.com/IPXE-100612_undionly.kpxe > undionly.kpxe
At this point a generic PXE boot server is complete. iPXE will still expect smartos.ipxe, but that can be created with whatever content is needed. For those interested in booting SmartOS, what follows are the steps to provide SmartOS boot services on this server.
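Before booting a client, it can help to confirm that the two files named by the dhcp-boot lines are actually present in the TFTP root. The following is a minimal sketch, not part of the original instructions; `check_tftproot` is a hypothetical helper name, and it assumes the /tftpboot layout used above.

```shell
#!/bin/sh
# Report boot files that are missing (or empty) in the TFTP root.
# Prints one "missing: <path>" line per problem file and returns 1 if any
# were found, 0 otherwise. $1 = TFTP root, defaults to /tftpboot.
check_tftproot() {
    root="${1:-/tftpboot}"
    missing=0
    for f in undionly.kpxe smartos.ipxe; do
        # -s is true only if the file exists and is not empty,
        # which also catches a download that produced a zero-byte file.
        if [ ! -s "$root/$f" ]; then
            echo "missing: $root/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

Run inside the zone as `check_tftproot`. Until smartos.ipxe has been created, it is expected to report that file as missing.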
Providing SmartOS PXE Boot Services
A template iPXE config is useful both upfront and when updating to new platform releases. Create /tftpboot/smartos.ipxe.tpl with the following content (-B smartos=true is essential, otherwise logins will fail):
    #!ipxe
    # /tftpboot/smartos.ipxe.tpl
    kernel /smartos/$release/platform/i86pc/kernel/amd64/unix -B smartos=true
    initrd /smartos/$release/platform/i86pc/amd64/boot_archive
    boot
Then create the directory that will hold the platform releases:
    cd /tftpboot
    mkdir smartos
Deploy/Update to the latest SmartOS platform release
The steps in this section work for both initial deployment and upgrades as Joyent releases them.
Next get the latest SmartOS platform and massage it into a workable shape for our iPXE config:
    cd /tftpboot/smartos
    curl https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/platform-latest.tgz > /var/tmp/platform-latest.tgz
(At the moment the URL https://download.joyent.com/pub/iso/platform-latest.tgz is invalid and returns 404.)
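Because a dead URL returns a 404 page, plain `curl ... > file` can leave an HTML error page where the tarball should be. A possible guard, sketched here with a hypothetical `fetch_platform` helper, uses curl's `-f` option so that HTTP errors make the download fail instead.

```shell
#!/bin/sh
# Download a file, failing on HTTP errors instead of saving the error page.
# $1 = URL, $2 = destination path.
#   -f   exit non-zero on HTTP status >= 400
#   -sS  no progress meter, but still print error messages
#   -L   follow redirects
fetch_platform() {
    url="$1"
    dest="$2"
    # Download to a temporary name so a failed transfer never
    # replaces a previously downloaded tarball.
    if curl -fsSL -o "$dest.part" "$url"; then
        mv "$dest.part" "$dest"
    else
        rm -f "$dest.part"
        echo "download failed: $url" >&2
        return 1
    fi
}
```

For example: `fetch_platform https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/platform-latest.tgz /var/tmp/platform-latest.tgz`.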
    cat /var/tmp/platform-latest.tgz | tar xz
    directory=`ls | grep platform- | sort | tail -n1`
    release=${directory:9}
    mv $directory $release
    cd $release
    mkdir platform
    mv i86pc platform
    cd /tftpboot
    cat smartos.ipxe.tpl | sed -e"s/\$release/$release/g" > smartos.ipxe
Make sure PXE boot is enabled and that it is the first in the boot sequence.
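The two text transformations in the steps above (deriving the release name from the extracted directory, and filling in the template) can be pulled out into small functions that are easy to try in isolation. This is a sketch only: `release_from_dir` and `render_ipxe` are hypothetical names, and it uses the POSIX `${var#prefix}` form where the steps above use the bash-only `${directory:9}`.

```shell
#!/bin/sh
# Strip the "platform-" prefix from an extracted directory name,
# e.g. platform-20131128T230213Z -> 20131128T230213Z.
release_from_dir() {
    printf '%s\n' "${1#platform-}"
}

# Replace every literal "$release" in the template with the release name
# and write the result to stdout. $1 = template file, $2 = release name.
# Release names are timestamps, so they contain no characters special to sed.
render_ipxe() {
    template="$1"
    release="$2"
    sed -e 's/\$release/'"$release"'/g' "$template"
}
```

With these, the last step becomes `render_ipxe /tftpboot/smartos.ipxe.tpl "$release" > /tftpboot/smartos.ipxe`.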
Thanks
Thanks to Alain O’Dea for his notes on setting up Ubuntu Server 12.04.1 LTS as a PXE server to boot SmartOS, and a big thanks to Ben Rockwood for creating and maintaining the PXE Booting SmartOS wiki page. Without their instructions I could not have done it.
Enjoy and stay tuned!
-
Happy SysAdmin Day!